NOMENCLATURE


ATTRACTION:   The  pull  or  attracting  power  of  a  zone,  usually 
measured  in  terms  of  number  of  trip  ends.   In  gravity  model 
P&A  terminology,  the  attraction  zone  is: 

1.  For  home  based  trips,  the  zone  corresponding  to 
the  non-home  end  of  the  trip. 

2.  For  non-home  based  trips,  the  zone  of  destination. 

AVERAGE DAILY TRAFFIC (ADT): The average number of vehicles
passing  a  specified  point  during  a  24  hour  period. 

BASE  YEAR:   The  year  to  which  all  survey  data  are  related. 
Usually  the  year  in  which  the  survey  data  were  collected. 

BMD02R: A widely used computer program for performing
stepwise  multiple  regression  developed  by  the  University  of 
California. 

CLUSTERED  TRIP  DATA  (OR  INFORMATION) :   Trip  data  obtained 
through  a  statistical  sampling  process  wherein  the  unit 
actually  being  sampled  is  not  the  trip  but  some  associated 
unit  such  as  the  vehicle  or  dwelling  unit. 

CODE: (noun) A system of symbols for representing information.
(verb) To reduce data to a more meaningful form for
subsequent data processing.

CONSISTENCY  CHECK:   An  error  check  in  which  the  contents  of 
two  or  more  data  fields  are  mutually  examined  to  insure  that 
they  are  not  contradictory.   The  fields  involved  may  be  in 
the  same  or  different  records.   Sometimes  called  a 
contingency  check. 

CORDON  LINE:   An  imaginary  line  enclosing  a  study  area. 

COUNT :   The  actual  number  of  vehicles  passing  a  specified 
point  during  a  stated  time  period  as  determined  by  human  or 
mechanical  tabulation. 

DEPENDENTLY  CHECK:   To  attempt  to  verify  the  correctness  of 
previously  obtained  codes  by  having  a  second  person  examine 
the  original  codes  and  decide  whether  they  were  correct  or 
not. 

DESTINATION:   The  terminus  of  a  trip. 


DIGITIZATION:   The  process  of  converting  data  to  a  form 
which  is  directly  readable  by  a  computer.   In  the  past, 
keypunching  has  been  the  digitization  process  most  frequently 
used. 

DIRECTIONAL  LINK:  A  link  in  which  traffic  is  allowed  in  only 
one  direction.  There  may  be,  and  usually  is,  a  corresponding 
link  in  the  opposite  direction. 

DIRECTIONAL  SPLIT:   In  gravity  model  P&A  terminology,  a  pair 
of  numbers  indicating  the  relative  orientation  of  home  based 
trips  to  and  from  the  production  (home)  end.   The  first 
number  is  the  percentage  of  productions  which  actually 
originate  in  the  production  zone  and  terminate  in  the 
attraction  zone.   The  second  number  is  the  percentage  of 
productions  which  actually  terminate  in  the  production  zone 
and  originate  in  the  attraction  zone.   The  two  numbers  sum 
to  100  per  cent  unless  factoring  is  being  performed. 

DWELLING  UNIT  (D.U.):   A  room  or  group  of  rooms  occupied  or 
intended  for  occupancy  as  separate  living  quarters  by  a 
family  or  other  group  of  persons  living  together  or  by  a 
person  living  alone. 

EDIT:   To  perform  the  two  step  process  of  checking  data  for 
errors  and  rectifying  any  errors  which  are  found.   Sometimes 
used  to  mean  solely  the  checking  of  data  for  errors. 

EDITING  PROCEDURE:  A  methodology  for  data  editing  consisting 
of  an  error  checking  step  and  an  error  rectification  step. 

ERROR  CHECK:   A  test  made  to  insure  that  a  given  data  field 
(or  combination  of  data  fields)  conforms  to  predefined 
conditions  of  acceptability. 

ERROR  RECTIFICATION:   The  process  whereby  individual  trip 
cards  or  groups  of  associated  trip  cards  containing  data 
failing  error  checks  are  either  discarded  or  retained.   If 
retained,  methods  of  substituting  data  which  will  pass  the 
error  checks  for  those  data  which  failed  the  error  checks 
must  be  specified.   The  substituted  data  do  not  necessarily 
have  to  be  "correct"  data. 

EXPANSION  FACTOR:   A  multiplication  factor  indicating  the 
number  of  units  in  the  entire  population  which  the  given 
unit  represents. 

EXTERNAL :   An  adjective  indicating  outside  of  the  study  area. 

EXTERNAL  TRIPS:   Those  trips  which  cross  the  cordon  line. 

FRICTION FACTORS (F FACTORS): A set of numbers, one for
each  travel  time  increment,  indicating  the  effects  of  travel 
time  upon  number  of  trips.   Used  in  the  gravity  model. 

GEOGRAPHIC  CODING  INDICES:   Reference  material  similar  to 
telephone  directories  used  by  coders  in  determining  in  which 
zone  a  given  trip  end  lies.   The  two  main  types  are  the 
street  address  coding  index  and  the  place-name  coding  index. 
See  Appendix  B  for  examples. 

GRAVITY  MODEL:   A  mathematical  formula  that  distributes 
trips  between  zones  proportional  to  the  attraction  of  the 
destination  zone  and  inversely  proportional  to  some  function 
of  the  separation  between  the  zones. 
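
The definition above can be illustrated with a minimal sketch. It assumes the common production-constrained form, in which the trips from zone i to zone j equal P_i * A_j * F_ij divided by the sum of A_k * F_ik over all zones k. The source gives no formula at this point, so the function name, the argument layout, and the sample numbers are illustrative assumptions only.

```python
def gravity_model(productions, attractions, friction):
    """Distribute each zone's productions among all attraction zones.

    friction[i][j] is the friction factor for the travel time between
    zones i and j (assumed layout; larger values mean less resistance).
    """
    trips = []
    for i, produced in enumerate(productions):
        # Weight of each destination: its attraction times the friction factor.
        weights = [a * f for a, f in zip(attractions, friction[i])]
        total = sum(weights)
        row = [produced * w / total if total else 0.0 for w in weights]
        trips.append(row)
    return trips


# Two-zone example with made-up numbers.
productions = [100.0, 200.0]
attractions = [150.0, 150.0]
friction = [[2.0, 1.0], [1.0, 2.0]]
table = gravity_model(productions, attractions, friction)
```

Each row of `table` sums to the productions of its zone, which is the property a production-constrained distribution is expected to preserve.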

HOME  BASED  TRIPS:   Trips  with  either  end  at  the  residence 
(home) . 

HOME  BASED  SHOPPING  (HBS)  TRIPS:   Those  home  based  trips  in 
which  the  purpose  at  the  non-home  end  is  shopping. 

HOME  BASED  WORK  (HBW)  TRIPS:   Those  home  based  trips  in  which 
the  purpose  at  the  non-home  end  is  work. 

IMPEDANCES:   A  series  of  computer  records  containing  the 
resistance  to  travel  (usually  travel  time)  between  each  pair 
of  zones.   Formerly  called  skimmed  trees. 

INCORRECTLY  RECORDED  DATA:   Data  which,  after  passing  through 
any  or  all  of  the  human/mechanical  processes  of  interviewing, 
coding,  or  digitizing,  differs  from  what  it  should  be  based 
upon  the  information  input  to  the  process. 

INDEPENDENTLY  CHECK:   To  verify  the  correctness  of  previously 
obtained codes by assigning additional coder(s) to recode the
data.   These  additional  coders  would  only  be  exposed  to  the 
information  from  which  the  codes  were  derived  and  not  to  the 
original  coding.   After  all  coders  have  completed  their 
tasks,  corresponding  data  items  would  be  compared  to  determine 
whether  the  various  coders  agreed  or  disagreed. 

INTERNAL:   An  adjective  indicating  inside  of  the  study  area. 

INTRAS:   The  number  of  intrazonal  trips. 

INTRAZONAL  TRAVEL  TIME:   The  average  travel  time  for  trips 
beginning  and  ending  in  the  same  zone,  excluding  terminal 
time. 

LINK:   A  section  of  street  or  highway  identified  by  the  nodes 
at  its  ends.   A  link  may  be  one-way  or  two-way. 




LINKED TRIPS: Those trips remaining after application of a
process  whereby  each  sequence  of  trips  with  intermediate 
purposes  of  serve  passenger  and/or  change  mode  is  reduced  to 
a  single  trip  representing  the  "true"  origin  and  destination. 
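
A minimal sketch of such a linking process follows. The record layout (`origin`, `destination`, `purpose_to`) and the purpose labels are hypothetical, since the actual trip card fields are not given here; the sketch assumes the trips of one person are supplied in time order.

```python
# Purposes treated as intermediate stops rather than true destinations.
INTERMEDIATE = {"serve passenger", "change mode"}


def link_trips(trips):
    """Collapse runs of trips joined by intermediate purposes into single trips."""
    linked = []
    pending = None
    for trip in trips:
        if pending is None:
            pending = dict(trip)  # start a new linked trip
        else:
            # Extend the pending trip to the end of the current leg.
            pending["destination"] = trip["destination"]
            pending["purpose_to"] = trip["purpose_to"]
        if pending["purpose_to"] not in INTERMEDIATE:
            linked.append(pending)
            pending = None
    if pending is not None:
        linked.append(pending)  # sequence ended on an intermediate purpose
    return linked


legs = [
    {"origin": "home", "destination": "bus stop", "purpose_to": "change mode"},
    {"origin": "bus stop", "destination": "office", "purpose_to": "work"},
    {"origin": "office", "destination": "home", "purpose_to": "home"},
]
linked = link_trips(legs)
```

Here the first two legs collapse into one home-to-office work trip, and the return trip is left unchanged.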

LOADING  (THE  NETWORK) :   The  computer  process  of  accumulating 
loads  upon  each  link  in  the  network  as  trips  from  a  triptable 
matrix  are  assigned  to  their  associated  routes. 

LOADS (ON THE NETWORK): The volume assigned on an individual
link or the collection of volumes assigned on all links in a
network as a result of loading the network.

LOCAL  LINK:   A  link  representing  the  local  streets  which  has 
as  one  node  a  centroid  (that  point  in  a  zone  which  is  used 
to  load  all  trips  to  and  from  that  zone) . 

MARK  SENSE  READER:   A  mechanical  device  which  digitizes  scan 
sheets. The output medium is a punched card.

MULTIPLE  PUNCHES:   The  technique  by  which  a  single  column  on 
a  data  card  is  used  for  recording  two  (or  more)  separate 
data  items.   Normally,  one  data  item  will  be  a  number  and 
the  second  will  be  something  such  as  sex  which  could  be 
indicated  by  the  presence  or  absence  of  an  additional  punch 
in  row  eleven  or  row  twelve  of  the  data  card. 

NETWORK :   The  collection  of  links  and  nodes  by  which  the 
major  components  of  the  street  system  within  the  study  area 
are  defined  to  the  computer. 

NODE :   A  numbered,  identifiable  point  in  a  network  at  the 
junction  of  two  or  more  links. 

NON-HOME  BASED  (NHB)  TRIPS:   Trips  with  neither  end  at  the 
residence  (home) . 

ORIGIN:   The  point  at  which  a  trip  commences. 

ORIGIN  AND  DESTINATION  (O&D)  DATA:   A  collection  of  data 
indicating  the  true  origins  and  destinations  of  trips. 

PER  CENT  ROOT  MEAN  SQUARE  ERROR:   A  statistical  measure  of 
the  differences  between  two  sets  of  corresponding  numeric 
data,  one  set  of  which  is  called  the  base  data  against  which 
the  other  set,  called  the  test  data,  is  compared.   If  there 
are n data values in each set and b_i and t_i are the values
of the i-th pair of base and test data, respectively,
then:
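
The expression that followed this definition is not legible in this copy. The sketch below shows one common definition, assumed here rather than taken from the source: the root mean square of the differences t_i - b_i, expressed as a percentage of the mean base value. Whether the original divided by n or by n - 1 is not known; n is assumed.

```python
import math


def percent_rmse(base, test):
    """Per cent root mean square error of test data against base data.

    Assumed form: 100 * sqrt(sum((t_i - b_i)^2) / n) / mean(base).
    """
    if len(base) != len(test) or not base:
        raise ValueError("base and test must be non-empty and of equal length")
    n = len(base)
    rms = math.sqrt(sum((t - b) ** 2 for b, t in zip(base, test)) / n)
    mean_base = sum(base) / n
    return 100.0 * rms / mean_base


# Made-up link volumes: counts as base data, assigned loads as test data.
example = percent_rmse([100, 200, 300], [110, 190, 300])
```

Identical base and test data give a value of zero under this definition.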

PLACE-NAME CODING INDEX: A geographic coding index consisting
of  a  list  of  locations  commonly  known  by  their  names  and  the 
associated  zone  numbers. 

PROCEDURE  A:   As  defined  in  this  research  project,  the 
editing  procedure  (or  the  travel  survey  data  after  application 
of  the  editing  procedure)  whereby  no  error  checks  were  made 
and  thus  no  error  rectification  was  needed. 

PROCEDURE B: As defined in this research project, the
editing procedure (or the travel survey data after application
of the editing procedure) whereby extensive error checks were
made and all samples containing errors were discarded in their
entirety.

PROCEDURE C: As defined in this research project, the
editing procedure (or the travel survey data after application
of the editing procedure) whereby extensive error
checks were performed, all possible corrections were made
to the data in error, and the few remaining samples which
had errors which could not be corrected were discarded in
their entirety.
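
The three procedures can be contrasted in a minimal sketch. The `has_error` and `try_correct` functions stand in for the project's actual error checks and corrections, which are not given here; the card layout and zone range are hypothetical.

```python
def apply_procedure(samples, procedure, has_error, try_correct):
    """Return the samples retained under editing Procedure A, B, or C.

    samples maps a sample number to the list of trip cards for that sample.
    try_correct returns a corrected card, or None if no correction is possible.
    """
    if procedure not in ("A", "B", "C"):
        raise ValueError("procedure must be 'A', 'B', or 'C'")
    if procedure == "A":
        # No error checks, so nothing is changed or discarded.
        return {sid: list(cards) for sid, cards in samples.items()}
    kept = {}
    for sid, cards in samples.items():
        if procedure == "B":
            # Discard the whole sample if any card fails a check.
            if any(has_error(c) for c in cards):
                continue
            kept[sid] = list(cards)
        else:
            # Correct what can be corrected; discard the sample otherwise.
            fixed = [c if not has_error(c) else try_correct(c) for c in cards]
            if any(c is None for c in fixed):
                continue
            kept[sid] = fixed
    return kept


def has_error(card):
    return card["zone"] not in range(1, 11)  # hypothetical valid zones 1-10


def try_correct(card):
    zone = card.get("address_zone")  # hypothetical fallback field
    return {**card, "zone": zone} if zone in range(1, 11) else None


samples = {
    1: [{"zone": 3}],
    2: [{"zone": 99, "address_zone": 4}],
    3: [{"zone": 99}],
}
```

With these made-up samples, Procedure A keeps all three, Procedure B keeps only sample 1, and Procedure C keeps samples 1 and 2.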

PRODUCTION:   The  generating  power  of  a  zone,  usually  measured 
in  terms  of  number  of  trip  ends.   In  gravity  model  P&A 
terminology,  the  production  zone  is: 

1.  For  home  based  trips,  the  zone  of  residence. 

2.  For  non-home  based  trips,  the  zone  of  origin. 

PRODUCTION  AND  ATTRACTION  (P&A)  DATA:   A  collection  of  trip 
data  or  trip  ends  reflecting  the  gravity  model  concept  of 
production  and  attraction  zones  rather  than  the  true  origins 
and  destinations. 

PURPOSE :   The  reason  for  making  a  given  trip.   In  the  travel 
surveys, a  purpose  is  usually  associated  with  each  trip  end. 
For  analysis,  a  single  trip  purpose  is  determined  based  on 
the  two  trip  end  purposes. 

QUALITY  CONTROL:   The  techniques  used  to  assure  accuracy  of 
results  in  the  interviewing,  coding,  and/or  digitizing 
operations. 

RESTRAINED LOADS: The network loads obtained through an
iterative loading process in which, at each step, the routes
of travel are modified to reflect the effects of the loads
on the individual links as found in the preceding
iteration(s).

SAMPLE:   The  individual  unit  (or  the  data  associated  with  it) 
that  was  statistically  selected  for  interview. 

SCAN  SHEET:   A  special  form  on  which  numeric  data  are  coded 
by  filling  in  with  a  pencil  a  series  of  rectangular  blocks 
each  of  which  is  associated  with  an  individual  numeric  digit 
(see  Figure  A3).   A  mark  sense  reader  is  then  used  to 
digitize  the  data. 

SCREENLINE:   An  imaginary  line,  usually  along  physical 
barriers  such  as  rivers  or  railroad  tracks,  splitting  the 
study  area  into  two  parts. 

SINGLE  FIELD  CHECK:   An  error  check  in  which  the  contents  of 
a  single  data  field  are  examined  without  reference  to  the 
contents  of  any  other  data  fields. 
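
A minimal sketch contrasting a single field check with a consistency check follows. The field names, the zone range, and the use of HHMM integers for times are hypothetical; the actual trip card layout and acceptance criteria are not given here.

```python
VALID_ZONES = range(1, 201)  # hypothetical range of survey zone numbers


def single_field_check(card):
    """Return the names of fields that are unacceptable on their own."""
    return [name for name in ("origin_zone", "destination_zone")
            if card.get(name) not in VALID_ZONES]


def consistency_check(card):
    """Return messages for fields that contradict one another."""
    problems = []
    start, arrival = card.get("start_time"), card.get("arrival_time")
    if start is not None and arrival is not None and arrival < start:
        problems.append("arrival_time precedes start_time")
    return problems


card = {"origin_zone": 12, "destination_zone": 0,
        "start_time": 815, "arrival_time": 805}
```

For this card the single field check flags `destination_zone`, while only the consistency check can detect that the two times contradict each other, since each time is acceptable by itself.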

STATION: A location at the cordon line where external survey
(roadside  interview)  data  or  count  data  were  collected. 

STREET  ADDRESS  CODING  INDEX:   A  geographic  coding  index 
showing  the  zone  numbers  associated  with  the  various  ranges 
of  house  numbers  on  each  street  in  the  study  area. 

TERMINAL  TIME:   That  time  required  to  unpark  (or  park)  and 
the  associated  walking  time  required  to  complete  the  trip 
exclusive  of  the  actual  vehicular  travel  time  between  the 
origin  and  the  destination. 

TRIP:   A  one-direction  movement  which  begins  at  the  origin 
at  the  start  time,  ends  at  the  destination  at  the  arrival 
time,  and  is  conducted  for  a  specific  purpose. 

TRIP  CARDS:   Data  cards  containing  survey-derived  trip  and 
related  information.   Not  every  trip  card  represents  a  trip, 
but  the  data  for  each  individual  trip  is  punched  on  a  single 
trip  card. 

TRIP  END:   Either  a  trip  origin  or  a  trip  destination. 

TRIPTABLE:   A  matrix  in  computer-readable  format  showing  how 
the  trips,  either  O&D  or  P&A,  are  distributed  between  each 
pair  of  zones. 

UNRESTRAINED LOADS: Network loads obtained through a single
loading  without  any  iterations. 

VINE:   A  computer  record  giving  the  path  of  minimum  impedance 
from  a  given  zone  to  all  other  zones.   The  path  is  permitted 
to  cross  over  itself  (e.g.,  three  successive  right  turns  in 
lieu  of  a  single  prohibited  left  turn  are  feasible). 

ZONE:  A  portion  of  the  study  area  which  is  delineated  as 
such  for  the  purpose  of  facilitating  land  use  and  traffic 
analyses. 


ABSTRACT


Zimmer,  Ralph  Wildy.   Ph.D.,  Purdue  University,  June 
1972. CONSEQUENCES OF CERTAIN DATA ERRORS IN THE
TRANSPORTATION PLANNING PROCESS. Major Professor: Harold L.
Michael. 

Using unedited travel survey data from the Evansville,
Indiana,  transportation  study,  the  major  objectives  of  this 
project  were: 

1.  To  classify  and  enumerate  the  number  of 
incorrectly  recorded  data  items  in  each  of 
the  sets  of  travel  survey  data  (invalid  or 
inconsistent  codes,  unreasonable  magnitudes, 
omitted items, etc. are classified as
incorrectly recorded data).

2.  To  determine  the  effects  upon  the 
calibrated  base  year  traffic  assignment, 
trip  distribution,  and  trip  generation 
models  of  following  different  computer 
editing  procedures. 

3.  Based  upon  the  above  results,  to  formulate 
guidelines  for  the  computer  processing  and 
editing  of  travel  survey  data. 

4.  To  determine  the  error  introduced  by 
assuming  a  50-50  directional  split  in 
converting  triptables  from  origin  and 
destination  format  to  production  and 
attraction  format. 

An  editing  procedure  was  defined  as  a  two  step  process: 
error  checking  and  error  rectification.   Error  checking  is 
the  step  in  which  the  individual  data  fields  in  trip  cards 
are examined to insure that they conform to pre-programmed
criteria  of  acceptability.   Error  rectification  is  the  step 
in  which  some  action  (e.g.,  deletion  or  correction)  is 
taken  relative  to  the  data  which  failed  the  error  checks. 

The results obtained when using thoroughly edited and
corrected survey data were not significantly different from
the results obtained using totally unedited data.
Significantly different results were obtained, however, when using
data which had been thoroughly edited and the incorrect
data discarded.

Based  on  the  above  results,  recommended  guidelines  for 
computer  processing  and  editing  of  travel  survey  data  were 
formulated.   These  call  for  making  important  single  field 
edits  but  only  a  minimal  number  of  consistency  checks.   All 
incorrect  data  in  samples  of  clustered  trip  information 
should  be  rectified  rather  than  discarded. 

The  review  of  literature  disclosed  a  number  of  factors 
worthy  of  consideration  in  form  and  code  design.   Similarly, 
information  on  keypunching  error  rates  was  found  in  the 
literature  and  supplemented  by  data  obtained  during  the 
course  of  the  project. 

An  assumed  50-50  directional  split  for  ADT  trips  was 
found to be justified.


CHAPTER  I 
INTRODUCTION 


Successful  transportation  planning  requires  accurate 
prediction  of  future  travel  demand  and  patterns  of  movement. 
It  has  long  been  recognized  that  such  predictions,  in  order 
to  be  accurate,  must  be  based  upon  a  thorough  understanding 
of  the  existing  travel  patterns  in  the  given  geographic 
area. 

It  has  been  estimated  that  during  just  the  twelve  year 
period 1958-1970, approximately two hundred million dollars
were  spent  on  urban  transportation  planning  (7)*.   A  major 
portion  of  the  cost  of  the  comprehensive  transportation 
planning  process  as  now  practiced  is  spent  in  the  data 
collection  phase  using  travel  survey  methodology  which  has 
changed  little  during  the  last  twenty-five  years.   Data  on 
internal  trips  of  a  personal  nature  by  residents  of  the 
study  area  are  collected  in  the  home  interview  survey,  data 
on  trips  in  the  study  area  by  non-residents  and  on  trips  into 
and  out  of  the  study  area  by  residents  are  collected  in  the 
external  (roadside  interview)  survey,  and  data  on  truck  and 
taxi  trips  are  collected  in  the  truck  and  taxi  surveys, 
respectively. 

Prior  to  the  mid-1940's,  transportation  planners  were 
primarily  interested  in  determining  the  best  locations  for 
urban  river  crossings  and  bypasses.   To  provide  information 
on  which  to  base  their  decisions,  roadside  interview  origin 
and  destination  (O&D)  studies  were  instigated. 


*The numbers in parentheses refer to sources listed in
the  List  of  References. 


In  awareness  of  the  growing  transportation  problems  in 
urban  areas,  the  United  States  Congress  passed  the  Federal- 
Aid  Highway  Act  of  1944  making  funds  available  for  highway 
projects  in  urban  areas.   Since  roadside  interviewing 
techniques  were  inadequate  in  providing  the  necessary  data 
about  existing  travel  patterns  within  an  urban  area,  an 
interviewing procedure involving the sampling of a
representative number of dwelling units within the urban area was
developed cooperatively by the U.S. Bureau of Public Roads
(now the Federal Highway Administration), the U.S. Bureau of
the Census, and several state highway departments. Eight
cities conducted home interview surveys using these
procedures during 1944 (59).

Urban  transportation  planning,  and  thus  travel  surveys, 
were  given  added  impetus  by  the  Federal-Aid  Highway  Act  of 
1962  which  basically  requires  all  urban  areas  of  over  50,000 
population  to  conduct  a  cooperative,  comprehensive,  and 
continuing  transportation  planning  process  in  order  to 
qualify  for  financial  participation  by  the  Federal  Government 
on  those  segments  of  the  Federal-Aid  system  lying  within  the 
urban  area.   To  date,  over  200  urbanized  areas  are  conducting 
such  studies. 

After travel survey data have been collected, they are
coded and converted to computer-readable format (e.g.,
keypunched). The resulting number of computer card images
is staggering, ranging from tens of thousands of card images
for the smallest study to a few million card images for
studies in the largest metropolises (38). These cards can
and do contain a number of errors, which might be arbitrarily
classified as sampling errors, deliberate errors
(e.g., manufactured data), underreporting errors (i.e., the
unintentional failure of the interviewee to report all of
his trips), and incorrectly recorded data errors. Substantial
research has been devoted to investigating different sampling
procedures and sampling errors (18,22,47), and methodology
exists and is routinely followed for identifying and
correcting deliberate errors (43) and the effects of
underreporting (53).

Incorrectly recorded data errors consist of such errors
as an invalid zone number or wrongly blank data field. Such
errors can be attributed to unintentional erroneous
disclosures by the interviewee, incorrect recording by the
interviewer, incorrect transcription by the coder, and/or failures
in the human/mechanical process by which the data are
converted to computer-readable format. As a practical matter,
many  such  errors  could  never  be  detected,  but  a  significant 
portion  of  the  most  critical  errors  could  be  located  by 
diligent  use  of  special  purpose  computer  "editing"  programs. 
It  is  generally  assumed  that  all  travel  survey  data  are 
thoroughly  edited  before  being  used  in  model  development. 

After  the  travel  survey  data  have  been  committed  to 
computer-readable  format  and  edited,  various  survey  accuracy 
checks  are  made  to  assess  the  accuracy  and  completeness  of 
both  the  dwelling  unit  and  the  trip  data.   As  a  result  of 
these  checks,  it  is  normally  necessary  to  factor  the  trip 
data in order to compensate for the traditional
underreporting of trips.

Using these factored travel survey data and other data,
the analyst is finally able to calibrate the various models
used in transportation planning studies: land use, trip
generation, trip distribution, modal split, and traffic
assignment. Land use models are used to predict future land
use; trip generation models are used to estimate the number
of trips by purpose which will originate or terminate in a
given geographic area (zone); trip distribution models are
used to determine how many of the trips which originate in a
given zone terminate in each of the other zones; modal split
models determine how many of the total trips are made by each
of the various major modes (e.g., auto driver, commercial
bus); and traffic assignment models are used to allocate
the various interzonal trip movements to specific routes
and thus predict the vehicular volume on each of the major
street segments within the urban area. Trip generation, trip
distribution, and traffic assignment models are always used,
but land use and modal split models frequently are not used
in studies involving smaller urban areas.

For over a decade, the most frequently used trip
distribution model for internal-internal trips has been the gravity
model.   Most  such  trips  have  one  end  at  the  home  and  thus  are 
treated  as  home  based  with,  by  definition,  the  production 
zone  being  the  zone  of  residence  and  the  attraction  zone 
being  the  zone  in  which  the  other  trip  end  lies.   This  results 
in  loss  of  directionality,  and  there  is  no  way  that  the 
corresponding  origin  and  destination  zones  can  be  determined 
precisely.   While  there  are  some  significant  advantages  in 
both  the  trip  generation  and  trip  distribution  models  in 
working  with  production  and  attraction  (P&A)  data  rather  than 
origin  and  destination  (O&D)  data,  there  is  a  concomitant 
disadvantage  in  that  the  output  home  based  triptables  must 
be  converted  from  P&A  format  to  O&D  format  before  they  can 
be  loaded  onto  the  network  during  the  traffic  assignment 
process.   To  do  this,  the  analyst  must  specify  the  desired 
directional  split  (e.g.,  50-50  or  55-45)  by  purpose. 
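
A minimal sketch of this conversion follows. It assumes the directional split is applied uniformly to every zone pair of a home based triptable: a share s of the productions in cell (i, j) travels from i to j, and the remaining share travels from j to i. The function and parameter names are illustrative and are not taken from any particular program.

```python
def pa_to_od(pa, from_home_share=0.5):
    """Convert a home based P&A triptable to O&D format.

    pa[i][j] holds trips produced in zone i and attracted to zone j.
    from_home_share is the first number of the directional split,
    expressed as a fraction (0.5 for 50-50, 0.55 for 55-45).
    """
    n = len(pa)
    to_home_share = 1.0 - from_home_share
    return [
        [from_home_share * pa[i][j] + to_home_share * pa[j][i] for j in range(n)]
        for i in range(n)
    ]


# Made-up two-zone table: 100 trips produced in zone 0, 20 in zone 1.
pa = [[0.0, 100.0], [20.0, 0.0]]
od_even = pa_to_od(pa)          # 50-50 split
od_skew = pa_to_od(pa, 0.55)    # 55-45 split
```

A 50-50 split always yields a symmetric O&D table, so any real imbalance between the two directions is lost; the total number of trips is preserved under either split.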


CHAPTER  II 
OBJECTIVES  AND  POTENTIAL  BENEFITS 


Objectives 
The  objectives  of  this  study  were: 

1.  To  classify  and  tabulate  all  identifiable 
incorrectly  recorded  data  errors  in  the 
unedited  trip  cards  from  a  given  study. 

2.  To  determine  the  ultimate  consequences 
of  applying  various  computer  editing 
procedures  for  locating  and  rectifying 
those  errors. 

3.  To  formulate,  based  on  the  above  results, 
recommended  guidelines  for  trip  card 
editing  procedures. 

4.  To  determine,  if  possible,  techniques 
by  which  the  occurrence  of  incorrectly 
recorded  data  errors  could  be  minimized. 

5.  To  supplement  existing  knowledge  about 
the  sensitivity  of  the  trip  generation, 
trip  distribution,  and  traffic 
assignment  models  to  input  data  errors. 

6.  To  determine  the  error  which  results  from 
assuming  a  50-50  directional  split  when 
converting  home  based  P&A  triptables  to 
O&D  triptables  for  a  relatively  small 
urban  area. 


Potential  Benefits 

An understanding of the nature and sources of
incorrectly recorded data errors would hopefully lead to the
development  of  techniques  for  the  minimization  of  such  errors. 

Data  on  the  number  and  type  of  incorrectly  recorded  data 
errors,  when  combined  with  similar  data  from  other  studies, 
would  provide  a  factual  basis  for  subsequent  transportation 
studies  to  use  in  evaluating  the  quality  of  work  being 
performed  by  their  interviewers,  coders,  and  keypunch  (or 
equivalent)  operators  and/or  their  equipment.   Unsatisfactory 
performance  could  then  be  corrected  before  the  resulting 
errors  invalidate  subsequent  analyses. 

Quantitative  data  would  be  available  for  estimating  the 
amount of error that would be introduced by following
different computer editing procedures. Such data would be
invaluable, for example, if data errors were suddenly discovered
part  way  through  the  transportation  planning  process,  for 
the  analyst  would  have  some  valid  basis  for  estimating  the 
effects  of  the  errors. 

A  factual  basis  would  be  available  for  the  development 
of  recommended  guidelines  for  the  computer  editing  of  trip 
cards.  The  implementation  of  such  guidelines  would  insure 
that  the  data  are  "adequately"  edited  and,  perhaps,  result 
in savings in time and money for those studies that
otherwise might allocate an unnecessary amount of resources to
"completely"  edit  the  data. 

Supplementary  knowledge  about  the  sensitivity  of  the 
various  transportation  planning  models  to  input  data  errors 
might  help  the  analyst  in  knowing  how  finely  to  "tune"  each 
of  the  models.   In  addition,  such  knowledge  could  assist 
researchers  in  determining  where  best  to  allocate  their 
funds  and  efforts. 

The results of the investigation into the effects of
directional split when converting home based P&A triptables
for a smaller urban area to O&D format should either
substantiate the existing practice of assuming a 50-50
directional split or indicate that this practice introduces an
excessive amount of error. In the latter case, the accuracy
of the transportation planning process could be improved
although there would be a corresponding but slight increase
in cost.


CHAPTER  III 
REVIEW  OF  LITERATURE 


Code  Design 

Non-quantitative  travel  survey  data,  such  as  zone 
numbers,  land  use  information,  trip  end  purposes,  sex,  type 
of commodity, etc., are represented by a multi-digit (one or
more)  field  called  a  code.   Transportation  planners  generally 
have  devoted  little  thought  to  the  selection  of  the  codes  to 
be  used  other  than  attempting  to  pick  codes  which  would 
facilitate  computer  processing. 

Myers  listed  the  characteristics  of  a  good  code  and 
classified  codes  into  five  basic  types  (33)  .   Other  authors 
have  suggested  different  code  classifications  (4,56).   Not 
surprisingly,  the  two  organizations  which  have  considered 
general code design in some detail are the Federal
Government (56) and the Bell Telephone System (4). The business
industry  has  shown  some  interest  in  the  selection  of  codes 
to  represent  accounts,  but  their  interest  has  been  largely 
in  selecting  codes  which  would  allow  sufficient  expansion 
capabilities  to  avoid  the  necessity  of  assigning  a  new 
customer  an  out-of-sequence  code. 

Banking  institutions  in  particular  are  concerned  that, 
because  of  a  human  or  mechanical  error  in  entering  an 
account  number,  a  transaction  might  be  "credited"  to  the 
wrong  account.   Such  institutions  have  generally  adopted  a 
modulo-n  check  digit  as  the  suffix  to  the  normal  account 
number (2,3,6,20,41,64). The value of the check digit is
mathematically dependent upon the values of the preceding
digits, so a surprisingly large number of errors (e.g., all
single digit substitutions and transpositions of two
adjacent digits) can be readily detected. Mod 10 and mod 11
systems are used the most frequently, but mod 9 and mod 7
systems are also in use (50). Consideration might be given
to  using  a  check  digit  in  sample  numbers  and/or  survey  zone 
numbers,  although  such  use  might  not  be  justified. 
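
A minimal sketch of one mod 11 scheme follows. The weights (2, 3, 4, ... applied from the rightmost digit) and the use of "X" for a check value of ten are assumptions chosen for illustration; the schemes used by the cited institutions are not specified here. With distinct weights below eleven, any single digit substitution or transposition of two unequal adjacent digits changes the weighted sum by an amount that is not a multiple of eleven, so it is detected.

```python
def mod11_check_digit(number):
    """Return the mod 11 check character for a string of up to nine digits."""
    if not number.isdigit() or len(number) > 9:
        raise ValueError("expected a string of one to nine digits")
    # Weights 2, 3, 4, ... are applied starting from the rightmost digit.
    total = sum(w * int(d) for w, d in zip(range(2, 11), reversed(number)))
    check = (-total) % 11
    return "X" if check == 10 else str(check)


def is_valid(coded):
    """Check a number whose last character is its mod 11 check character."""
    body, check = coded[:-1], coded[-1:]
    return body.isdigit() and len(body) <= 9 and mod11_check_digit(body) == check


zone_code = "12345" + mod11_check_digit("12345")
```

In this example the coded value is accepted, while a copy with one digit altered or two adjacent digits interchanged is rejected. The cost is one extra character per number, plus the occasional non-numeric "X".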

The  psychological  literature  is  full  of  research  which 
might  be  worthy  of  consideration  in  the  design  of  codes.   In 
a  classic  paper,  Yule  reported  substantial  evidence  that 
people have decided preferences for certain digits in
reading the last digit of a scale and in reporting the last
digit of their age (65). Other research indicates similar
preferences among letters of the alphabet (25).

Stanhagen  investigated  coding  errors  and  developed  a 
matrix  showing  the  probability  that  any  given  character 
would be incorrectly coded for another given character (48).
Neisser and Weene published a similar matrix showing the
number of times a given hand-printed character was
interpreted as another character (34). Other research indicates
the  highest  error  rates  occur  when  coding/copying  alphabetic 
characters,  the  lowest  occur  when  working  with  numeric 
characters,  and  error  rates  in  between  these  two  extremes 
occur  when  working  with  alpha-numeric  codes  (12,48). 

Numerous  researchers  have  considered  the  optimum  number 
of  characters  to  be  grouped  together  for  short-term  memory 
retention  such  as  for  dialing  of  telephone  numbers  or  manual 
transcription.   They  have  generally  concluded  that  groups  of 
3  or  4  characters  are  best  (10,28,45).   This  might  suggest 
that long survey zone numbers and sample numbers be artificially
broken into smaller groups in order to minimize errors
and  speed  coding. 

Conrad  (11)  and  Waugh  (63)  have  shown  that  errors  are 
not  randomly  distributed  among  the  various  characters  of  a 
multi-character  group  of  digits  subjected  to  short-term 
memory. Errors are most likely in the next-to-last character
and the second-from-last character and are least likely in
the first and last characters of the group. This might suggest
that codes (e.g., survey zone numbers) be designed
which place the two most important digits in the first and
last positions, respectively.

Quality  Control  of  Coding  and  Digitizing  Operations 

Most transportation studies have used rather straightforward,
unsophisticated quality control procedures for the
coding and digitizing (e.g., keypunching) operations.
Typically, coding quality control has been limited to having
a second coder dependently check all or a portion of the
interviews, simply making whatever corrections are considered
necessary. Most studies have a supervisor scan the forms
for  completeness  and  continuity.   Digitizing  quality  control 
has  typically  been  limited  to  100  per  cent  key  verification 
with  the  second  operator  simply  making  whatever  corrections 
are  necessary. 

The Bureau of the Census has developed and uses much more
sophisticated quality control procedures (14,17,31,32,58).
These generally consist of a hybrid procedure combining
elements of both process control and acceptance sampling.
Although the Bureau previously used dependent verification
techniques exclusively, it now relies on various independent
verification methods. This change came about after its
own  research  revealed  that  dependent  verification  of  coding 
failed  to  disclose  from  29  to  69  per  cent  of  the  actual 
errors,  with  an  average  of  about  50  per  cent  of  the  errors 
being  missed  (17,31,58). 

Coding 

Once  the  field  interview  forms  reach  the  office,  they 
are  "coded"  for  subsequent  conversion  to  machine-readable 
format. This coding operation entails a variety of operations
ranging from simple copying of stated numeric data
(e.g., number of cars) to determination of a correct zone
number or other classification and the recording of an
appropriate identifying code.

The  literature  reveals  little  about  the  nature  and 
magnitude  of  errors  which  occur  specifically  during  coding, 
but  it  would  seem  reasonable  that  their  nature  is  similar  to 
that  of  digitizing  errors  as  discussed  in  the  next  section. 
It  would  also  seem  reasonable  to  assume  that  the  majority  of 
the  errors  found  during  computer  editing  are  either  coding 
errors  or  should  have  been  discovered  and  corrected  during 
coding. In the one reference which was found in the literature,
Fasteau et al. reported that 3.0 per cent of the
industry and occupation codes in the 1960 Census were
incorrect (17).

In a Rand Corporation study, Owsowitz and Sweetland
suggested that coding errors occur randomly (40). For
example,  if  there  were  ten  possible  codes  and  a  coding  error 
were  made,  they  indicated  that  there  is  an  equal  probability 
that  any  of  the  other  nine  possible  codes  would  have  been 
coded  even  if  only  five  of  the  nine  were  permissible  entries. 
Stanhagen  subjected  this  assumption  to  extensive  research 
and  found  it  to  be  invalid.   Instead  of  being  random, 
erroneous  codes  showed  a  marked  tendency  toward  "expected 
values"  (48). 

Digitizing 

Digitizing  is  the  process  of  converting  source  documents 
to  machine-readable  format.   In  the  past,  keypunching  has 
been  almost  the  sole  method  of  digitization.   However,  new 
"data  entry"  methods  such  as  keyboard  to  tape/disc  are 
evolving  (1,5,13)  so  the  more  general  term  of  digitizing  is 
used  in  this  text. 

In his classic book, Brandon cited standards for keypunching
speeds of 5,000 keystrokes per hour for all alphabetic
punching, 4,000 keystrokes per hour for mixed
alphabetic/numeric, 8,000 keystrokes per hour for large
percentage numeric with little alpha, and 10,000 keystrokes
per hour for all numeric punching (8). Due to error detection
stops, Brandon stated that key verification takes about
3 per cent longer than keypunching. Significantly higher
and  supposedly  more  accurate  digitizing  and  verification 
speeds  are  claimed  for  some  of  the  new  keypunch  replacement 
equipment  primarily  as  the  result  of  the  elimination  of 
electro-mechanical  delays  and  the  elimination  of  individual 
"punched"  cards. 

In  comparing  digitizing  error  rates,  it  is  necessary 
to  use  errors  per  column  or  errors  per  keystroke  as  the  base 
measure,  for  the  percentage  of  cards  in  error  is  highly 
dependent  upon  the  number  of  required  keystrokes  per  card. 
In  addition,  it  is  normal  to  not  consider  errors  by  the 
machine  operator  which  the  operator  corrected.   Within  this 
context,  four  researchers  have  reported  digitizing  error 
rates of between 0.02 and 0.06 per cent of the total
keystrokes (15,21,24,29).
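
The dependence of the card error rate on the number of keystrokes per card can be illustrated with a short calculation. The sketch below assumes that keystroke errors are independent and equally likely at every position, which is a simplification (error rates are reported to vary across the card); the function name and the example card lengths of 20 and 80 keystrokes are likewise assumptions.

```python
def card_error_rate(per_keystroke_rate: float, keystrokes_per_card: int) -> float:
    """Probability that a card contains at least one keystroke error,
    assuming independent errors with the same rate at every keystroke."""
    return 1.0 - (1.0 - per_keystroke_rate) ** keystrokes_per_card


# Reported range: 0.02 to 0.06 per cent of keystrokes, i.e. 0.0002 to 0.0006.
for rate in (0.0002, 0.0006):
    for keystrokes in (20, 80):
        pct = 100.0 * card_error_rate(rate, keystrokes)
        print(f"rate={rate:.4f}  keystrokes={keystrokes:2d}  cards in error={pct:.2f}%")
```

The same per-keystroke rate yields roughly four times as many cards in error on an 80-keystroke card as on a 20-keystroke card, which is why errors per keystroke is the more comparable measure.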

Almost  all  travel  survey  data  are  key  verified  to  insure 
the  accuracy  of  the  original  digitization.   One  consultant 
has  reported  investigating  the  use  of  unverified  data,  but 
he  concluded  that  the  "...saving  in  punching  cost  is  more 
than  offset  by  cost  of  engineers  and  technicians  correcting 
the errors" (16). Key verification does not remove all
digitization errors. Hinds (24) found one per cent of the
original errors remained after verification, and Minton (31)
has reported from 3 to 14 per cent of the original errors in
Census  data  remained  after  verification. 

Smith  (46)  and  Carlson  (9)  have  classified  the  errors 
which  they  found  in  data  entry  operations.   Their  results, 
which  are  very  similar,  show  that  over  50  per  cent  of  the 
errors  are  due  to  single-digit  substitution,  about  20  per 
cent due to omission of a digit, about 6 per cent to insertion
of a digit, and 1.5-7.3 per cent due to transposition
of  adjacent  digits. 
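
The four error classes named above can be made concrete with a small classifier that compares a correct field with the field as keyed. This is only a sketch: the function name is hypothetical, the rules are a plausible reading of the class names rather than the definitions used by Smith or Carlson, and anything involving more than one elementary error is simply labeled "other."

```python
def classify_error(correct: str, keyed: str) -> str:
    """Classify the difference between a correct field and the keyed field."""
    if keyed == correct:
        return "none"
    if len(keyed) == len(correct):
        diffs = [i for i, (a, b) in enumerate(zip(correct, keyed)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        # Two adjacent positions whose characters have been exchanged.
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and correct[diffs[0]] == keyed[diffs[1]]
                and correct[diffs[1]] == keyed[diffs[0]]):
            return "transposition"
    elif len(keyed) == len(correct) - 1:
        # Omission: deleting one character from the correct field gives the keyed field.
        if any(correct[:i] + correct[i + 1:] == keyed for i in range(len(correct))):
            return "omission"
    elif len(keyed) == len(correct) + 1:
        # Insertion: deleting one character from the keyed field gives the correct field.
        if any(keyed[:i] + keyed[i + 1:] == correct for i in range(len(keyed))):
            return "insertion"
    return "other"


for keyed in ("1294", "124", "12934", "1324"):
    print(keyed, classify_error("1234", keyed))
```
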

The in-depth study by Deming et al. (15) revealed several
other items of interest. First, they found keypunching
error  rates  increase  from  left-to-right  across  the  card,  thus 
indicating  that  the  most  important  data  should  be  placed  as 
far  left  as  possible.   Second,  they  found  that,  within  a 
given  data  field,  "unusual"  entries  are  more  likely  to  be  in 
error  than  "usual"  entries.   Third,  given  that  a  keypunch 
error  has  occurred,  the  erroneous  result  tends  toward  an 
expected  value. 

There  are  some  significant  ramifications  to  some  of  the 
new  digitizing  equipment  currently  on  the  market.   In  some 
of the equipment, there is an actual computer interface between
the keyboard operator and the storage media receiving the
data. Thus, a certain amount of computer editing can be done
concurrently with data entry. It has previously been suggested
that keypunch operators could do much of the manual
editing  (27),  and  this  new  technology  makes  such  a  procedure 
highly  desirable. 

Computer  Editing 

The  use  of  computer  editing  to  locate  incorrectly 
recorded  data  errors  in  travel  survey  data  has  long  been 
advocated  by  the  Federal  Highway  Administration: 

"Editing  trip  records  —  In  any  analysis 
that  uses  the  survey  data,  it  is  always 
necessary  to  edit  the  information  to  insure 
that  the  items  of  data  have  been  correctly 
coded  and  punched.   This  is  particularly 
important  where  a  great  deal  of  time  and 
money  is  to  be  expended  in  the  processing 
and  analysis  of  these  data.   If  the  source 
data  are  not  rigidly  controlled  and  edited, 
it  is  possible  that  much  useless  analysis 
will  result.   To  avoid  these  costly  problems 
and  to  permit  smooth  processing  in  later 
phases  of  model  development,  all  trip  records 
should  be  edited  using  an  appropriate  edit 
program."  (52,  p.  IV-3) 

"In  performing  contingency  checks  on  the 
basic trip file, all of the information, or
fields, that will be used in subsequent
programs should be checked for illegal codes,
illogical information, etc." (55, p. III-4)

Although  all  transportation  studies  have  undoubtedly 
done  some  editing  of  their  travel  survey  data,  most  such 
efforts have gone undocumented so very little is known concerning
the thoroughness of editing and the quantity and
types  of  errors  found.   One  probable  reason  for  this  lack 
of  documentation  is  a  natural  reluctance  to  admit  that  errors 
were  made  and  to  reveal  how  they  were  "fixed  up"  (57). 

Some data on editing are available in technical memorandums
and other hard-to-come-by documents. For example,
the  Indianapolis  transportation  study  reported  error  rates 
(per  cent  of  cards  in  error)  of  8.4,  4.9,  14.5,  and  3.5  per 
cent  on  their  dwelling  unit  summary  cards,  internal  trip 
report  cards,  external  survey  cards,  and  truck/taxi  survey 
cards,  respectively  (26).   Based  on  similar  data  from  a  few 
other  studies,  the  results  of  a  questionnaire,  and  personal 
contact with a number of practitioners, it can be concluded
that  it  is  not  uncommon  for  those  studies  which  conduct 
extensive  computer  edits  to  locate  one  or  more  errors  in 
over  30  per  cent  of  the  dwelling  unit  samples,  15-20  per  cent 
of  the  external  survey  data  cards,  and  about  15  per  cent  of 
the  truck/taxi  data.   It  should  be  borne  in  mind,  however, 
that  the  percentage  of  errors  found  in  travel  survey  data 
fluctuates  greatly  depending  upon  the  quantity  of  data 
contained  in  an  individual  record,  the  quality  of  the 
preceding interviewing, coding, and digitizing processes,
the  number  and  extent  of  previous  manual  edits,  and  upon 
the  number  and  types  of  computer  edits  to  which  the  data  are 
subjected. 

There  is  some  reason  to  suspect  that  less  stringent 
editing than that normally assumed necessary might be sufficient.
In a statistical investigation involving the 1953
Industrial  Census  in  Norway,  Nordbotten  concluded  that  a 
small number of edit checks would suffice and that little
would be gained by more thorough editing (36). Based on an
extension  of  such  results,  the  U.S.  Bureau  of  the  Census  and 
some  other  organizations  have  generally  adopted  procedures 
whereby  they  automatically  either  "impute"  another  value  to 
replace  one  which  an  edit  check  shows  to  be  in  error  or  else 
substitute a similar record (or portion thereof) for the
record containing the data in error (19,35,37,39,42,49,62).
They note, however, that "the development has not yet
reached the point, nor is it likely to do so in the immediate
future, of using the computer as a substitute for a
trained economic statistician in handling returns that show
important problems, ..." (62). At least one major transportation
study in the United States has used such techniques in
furnishing  missing  travel  survey  data  items  and  perhaps  for 
replacing  invalid  data  detected  through  edits.* 
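
The substitution idea can be sketched as a simple sequential "hot deck": when a field fails an edit check, it is replaced by the value from the most recently processed valid record in the same class. This is a generic illustration and not the Census Bureau's actual procedure; the field names, the classing variable, the validity rule, and the `imputed` flag are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, int]


def hot_deck_impute(records: Iterable[Record], field: str, class_field: str,
                    is_valid: Callable[[int], bool]) -> List[Record]:
    """Replace invalid values of `field` with the last valid value seen
    among earlier records having the same value of `class_field`."""
    last_valid: Dict[int, int] = {}          # class value -> most recent valid field value
    result: List[Record] = []
    for rec in records:
        rec = dict(rec)                      # copy so the input is left untouched
        value = rec.get(field)
        if value is not None and is_valid(value):
            last_valid[rec[class_field]] = value
        elif rec[class_field] in last_valid:
            rec[field] = last_valid[rec[class_field]]
            rec["imputed"] = 1               # flag the substitution for later audit
        result.append(rec)                   # records with no donor are left as they were
    return result


# Hypothetical dwelling unit records: trips per day, classed by cars owned.
units = [
    {"cars": 1, "trips": 6},
    {"cars": 2, "trips": 11},
    {"cars": 1, "trips": 99},    # fails the range check below
    {"cars": 2, "trips": -1},    # fails the range check below
]
cleaned = hot_deck_impute(units, "trips", "cars", lambda v: 0 <= v <= 30)
print(cleaned)
```

Flagging imputed values keeps the substitution visible, so the share of imputed data can be reported alongside any results.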

*This transportation study refused permission to be
cited in a bibliographical reference.

Directional Split Assumptions

During the early sixties, transportation studies using
the gravity model method of trip distribution normally
computed areawide average directional splits for each home
based trip purpose and applied those percentages when
converting the triptables from P&A format to O&D format.
The impetus for the above procedure was provided by the
Bureau of Public Roads as evidenced by the following
quotation:

"These (directional split) percentages
must be either assumed or preferably arrived
at by an analysis of the origin-destination
survey data. This analysis is rather simple
in theory but it requires a considerable
amount of time. Essentially all of the trip
records for each trip purpose or travel mode
category are sorted to determine how many of
them are produced by all zones of origin and
how many by all zones of destination. The
required percentages are then calculated
directly. These percentages, which represent
areawide averages, are then applied to each
zone ..." (51, p. VI-3)

About 1964, some inhouse (and undocumented) research
was conducted by Marks of the Bureau of Public Roads (30).
Working with 1955 Washington, D.C. origin-destination data,
Marks concluded that there was no significant difference
between using an assumed 50-50 directional split and the
computed percentage. Based on this research, the Bureau
changed its unwritten policy to accepting assumed 50-50
directional splits, and this procedure has been used almost
universally since that time. This change in thinking was
reflected in the wording of the 1965 gravity model manual:

"In  current  practice  a  50-50  split  is 
usually  assumed  for  all  home  based  trips." 
(52,  p.  VI-2) 
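
The conversion of a home based triptable from P&A format to O&D format can be written compactly. In the sketch below, `pa[i][j]` is the number of trips produced in zone i and attracted to zone j, and `split` is the assumed fraction of those trips travelling from the production zone to the attraction zone (0.5 under the 50-50 assumption). The function name and the small three-zone table are illustrative assumptions, not data from either study.

```python
from typing import List

Matrix = List[List[float]]


def pa_to_od(pa: Matrix, split: float = 0.5) -> Matrix:
    """Convert a production-attraction table to an origin-destination table.

    A fraction `split` of the trips in cell (i, j) travel from i to j; the
    remainder make the return movement from j to i.
    """
    n = len(pa)
    return [[split * pa[i][j] + (1.0 - split) * pa[j][i] for j in range(n)]
            for i in range(n)]


pa = [[10.0, 40.0, 0.0],
      [6.0, 4.0, 20.0],
      [2.0, 8.0, 5.0]]
od = pa_to_od(pa)            # 50-50 split
for row in od:
    print(row)
```

With a 50-50 split the resulting O&D table is symmetric and the total number of trips is unchanged; any other areawide percentage shifts trips between the two directions of each zone pair.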

Sensitivity  of  Travel  Models  to  Errors 

Relatively  little  of  a  quantitative  nature  is  known 
about the sensitivity of the transportation planning process
as  a  whole  to  differences  in  input  data  and  the  manner  in 
which  such  differences  or  errors  are  propagated  from  model 
to  model  (e.g.,  from  trip  generation  to  trip  distribution  to 
traffic  assignment) .   The  only  two  substantive  studies  in 
this  area  are  the  CONSAD  report  (54)  and  NCHRP  Report  120 
(23). 

Both  reports  agree  that  the  traffic  assignment  model  is 
the  least  sensitive  of  all  the  models  to  differences  in  input 
data. The CONSAD report further states that the most sensitive
of the models is the trip distribution model in which
"small errors ... were greatly magnified by the operations
performed  within  that  module." 

An  astute  observation  was  made  in  the  NCHRP  report: 

"An  early  conclusion  was  that  data  obtained 
by  survey  affect  the  strategic  plan  only 
indirectly.   Forecasts  intervene  between  the 
plan  and  the  data  at  every  step.   Under  these 
circumstances,  plans  are  based  directly  on 
forecasts,  not  on  data.   And  the  errors  of 
forecasts  are  apt  to  be  much  greater  than 
errors of data."


CHAPTER IV
BACKGROUND  INFORMATION  AND  INITIAL  DECISIONS 


Study  Area  and  Computer  Battery  Selections 

In  the  selection  of  a  study  area  for  research  purposes, 
prime  consideration  had  to  be  given  to  the  following: 

1.  Availability  of  unedited  travel  survey 
data. 

2.  Availability  of  original  interview  and 
coding  forms. 

3.  Availability  of  volume  and  classification 
counts  at  the  screenlines  and  at  the 
cordon  line. 

4.  Availability  of  a  coded  traffic 
assignment  network. 

5.  Availability  of  necessary  independent 
variable  data  for  trip  generation 
analyses. 

6.  Expected  cooperativeness  of  the  local 
people  and/or  consultant  who  were 
actually  involved  in  the  original 
study. 

On  the  basis  of  the  above  considerations,  Evansville, 
Indiana,  was  chosen  for  both  the  investigation  of  editing 
procedures  and  for  the  investigation  of  the  effects  of 
directional  split  assumptions.   In  addition,  it  was  decided 
to  utilize  unedited  travel  survey  data  from  the  Lafayette, 
Indiana,  study  to  provide  a  supplementary  source  of 
information  about  incorrectly  recorded  data  error  rates. 

For  various  reasons,  it  was  decided  to  use  the  Federal 
Highway Administration's IBM 360 battery of urban transportation
planning computer programs (60,61). The computer
used  was  the  Indiana  State  Highway  Commission's  IBM  360 
Model 50 at their headquarters office in Indianapolis.

The Evansville Regional Area Development
and  Transportation  Study 

Evansville  is  located  in  southwestern  Indiana  bordering 
the  Ohio  River  (the  internal  study  area  lies  entirely  on  the 
Indiana side of the river). Approximately
175,000 people reside within the cordon line in a total of
292  survey  zones.   Bus  ridership  was  negligible  at  the  time 
of  the  surveys. 

Vogt, Sage and Pflum, the consultant doing the transportation
study, collected the travel survey data between
October  of  1969  and  June  of  1970.   Reductions  of  the 
interviewing/coding  forms  used  in  the  various  surveys  are 
included  in  Appendix  A  and  examples  of  the  geographic  coding 
indices  are  included  in  Appendix  B. 

In  the  actual  study,  there  is  a  large  area  called  the 
COG  (Council  of  Government)  area  lying  between  the  cordon 
line referred to above and an "outer cordon." While the
consultant  is  obligated  to  do  some  special  analyses  within 
this  area,  no  home  interview  data  were  collected  from  this 
area  and  all  external  survey  interview  stations  were  located 
at  the  "inner"  cordon.   Therefore,  this  COG  area  has  been 
treated  as  any  other  external  area  for  the  purposes  of  this 
research  project. 

The  consultant  conducted  a  special  roadside  interview 
study  at  Dress  Memorial  Airport.   However,  neither  this 
research  project  nor  the  consultant  utilized  this  additional 
travel  survey  information  to  supplement  the  data  available 
from  the  other  surveys. 

Unfortunately,  extremely  poor  weather  existed  during  a 
substantial  portion  of  the  period  during  which  the  home 
interview  surveys  were  conducted  (they  commenced  in  January 
and ran through the first week of April, 1970). As a
result,  many  residents  held  their  travel  to  a  minimum  and 
thus  the  average  daily  traffic  (ADT)  during  the  survey 
period was significantly less than the annual average daily
traffic (AADT). In addition, there were major problems
obtaining  accurate  volume  counts  during  the  period  due 
primarily  to  the  failure  of  the  mechanical  equipment  to 
perform  properly. 

In  the  10  per  cent  home  interview  sample,  trip  logs 
were  sent  in  advance  to  the  selected  dwelling  units.   After 
the  travel  date,  the  interviewer  went  and  obtained  all 
required information. Both the interviewer and the interviewer's
supervisor checked the finished forms for completeness.
They were then coded for keypunching, and the
coding  was  subsequently  checked  dependently  by  another  coder. 
Finally,  a  supervisor  checked  the  coded  forms  for  completeness 
and  continuity.   There  was  a  telephone  call-back  procedure 
for  verifying  the  data  turned  in  by  the  interviewers,  and 
various  statistics  (e.g.,  trips  per  dwelling  unit)  were  kept 
on  the  data  submitted  by  the  individual  interviewers  for 
comparison  against  standards.   As  a  result  of  these  checks, 
two  interviewers  were  fired. 

External  survey  data  were  coded  on  mark  sense  forms  for 
digitization  by  the  Indiana  State  Highway  Commission  mark 
sense reader, and the same forms were used for field interviewing.
The field supervisor spot checked the completed
forms  as  they  were  submitted,  and  they  were  subsequently 
coded  in  the  office.   Finally,  the  coded  forms  were  checked 
by  a  supervisor. 

In  the  100  per  cent  sample  taxi  survey,  one  interviewer 
copied  all  of  the  interview  data  from  the  vehicle  logs  and 
then  coded  the  data.   The  same  tasks  were  performed  by 
several  different  interviewers  in  the  20  per  cent  truck 
sample, with the coding done by persons other than those who
collected the data. After coding, the forms from both surveys
were checked
dependently  by  other  coders.   Finally,  a  supervisor  checked 
the forms for completeness and continuity.

The Greater Lafayette Area Transportation
and  Development  Study 

The  twin  cities  of  Lafayette  and  West  Lafayette  are 
located  in  north  central  Indiana  about  one  third  of  the  way 
from Indianapolis to Chicago. Approximately
90,000 people reside within the cordon line in a total of
141 zones (1,611 subzones). Transit ridership is minimal.

Local  personnel  administered  the  collection  of  the 
travel  survey  data,  and  the  surveys  were  conducted  between 
September  of  1970  and  January  of  1971.   Reductions  of  the 
interviewing/coding  forms  used  in  the  various  surveys  are 
included  in  Appendix  A  and  examples  of  the  geographic  coding 
indices  are  included  in  Appendix  B. 

In  the  12.5  per  cent  home  interview  sample,  trip  logs 
were  delivered  in  advance  to  the  selected  dwelling  units. 
After  the  travel  date,  the  interviewer  returned  and  obtained 
all  required  information.   Both  the  interviewer  and  the 
interviewer's  supervisor  checked  the  finished  forms  for 
completeness.   They  were  then  coded  for  keypunching,  and  the 
coding  was  subsequently  checked  dependently  by  another  coder. 
There  was  a  10  per  cent  telephone  call-back  procedure  for 
verifying  the  data  turned  in  by  the  interviewers,  and 
statistics  were  kept  on  trips  per  auto  and  trips  per  dwelling 
unit  as  obtained  by  each  interviewer  for  comparison  against 
standards. 

Special  field  interview  forms  were  used  at  the  roadside 
interview  stations,  and  the  field  supervisor  spot  checked 
the  completed  forms  as  they  were  submitted.   The  trip  end 
addresses  were  coded  to  zones  directly  on  the  interview 
form,  and  each  of  these  codes  was  dependently  checked  by 
another coder. All of the data from the interview forms were
then  coded  onto  mark  sense  forms  for  digitization  by  the 
Indiana  State  Highway  Commission  mark  sense  reader.   A 
random  sample  of  the  completed  scan  sheets  was  checked 
dependently  by  another  coder. 


In  the  taxi  survey,  one  interviewer  copied  all  of  the 
interview  data  from  the  vehicle  logs  and  then  a  second 
person coded the data. The same tasks were performed by
several  different  interviewers  and  several  different  coders 
in  the  truck  survey.   All  coding  was  subsequently  checked 
dependently  by  another  coder,  and  a  telephone  call-back 
procedure  was  used  for  verifying  the  data  submitted  by  the 
interviewers  in  the  truck  survey. 

Selection  of  Computer  Editing  Procedures 

There are two steps to a trip card computer editing
procedure: error checking and error rectification. In the
first  step,  the  data  are  subjected  to  programmed  tests  to 
insure  their  validity  and  reasonableness.   These  tests  can 
be  individually  categorized  as  either  single  field  or 
consistency.   In  the  single  field  tests,  each  data  field  is 
examined  by  itself  without  regard  to  the  content  of  any  other 
data  field  in  an  attempt  to  determine  whether  its  contents 
are  possibly  valid. 

In  the  consistency  checks,  the  contents  of  a  given  data 
field  are  considered  in  relation  to  the  contents  of  one  or 
more  other  data  fields  (in  the  same  trip  card  or  in  a  related 
trip  card)  to  insure  that  they  are  "consistent."   Thus,  a 
trip  start  time  of  9  AM  and  a  corresponding  trip  arrive  time 
of  8  AM  would  both  pass  the  single  field  checks  but  would  be 
rejected  by  a  consistency  check  comparing  trip  start  time  to 
arrive  time.   Consistency  checks  are  much  more  difficult  to 
program  than  single  field  checks,  so  there  is  a  tendency 
toward  making  only  a  minimal  number  of  consistency  checks. 
Frequently,  there  are  many  error  checks  which  could  be  made 
which  would  locate  some  errors  but  would  also  needlessly 
reject  valid  data  (e.g.,  a  test  for  an  "unreasonably"  long 
trip  in  terms  of  time  might  needlessly  reject  a  long 
distance trip made over extremely icy roads). Those entries
marked by an asterisk in Table C3 fall in this category.
Because of the difficulty of handling correct data which are
needlessly  rejected,  these  potential  error  checks  are 
frequently  not  made. 
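
The distinction between single field checks and consistency checks can be sketched in a few lines. The record layout below is hypothetical: the field names, the use of minutes after midnight for times, and the assumption that zones are numbered 1 through 292 are illustrative choices and do not describe the actual trip card format or the checks listed in Appendix C.

```python
from dataclasses import dataclass
from typing import List

MAX_ZONE = 292                  # assumed zone numbering 1..292
MINUTES_PER_DAY = 24 * 60


@dataclass
class TripCard:
    origin_zone: int
    dest_zone: int
    start_min: int              # trip start time, minutes after midnight
    arrive_min: int             # trip arrive time, minutes after midnight


def single_field_errors(card: TripCard) -> List[str]:
    """Examine each field by itself, without regard to any other field."""
    errors = []
    for name in ("origin_zone", "dest_zone"):
        if not 1 <= getattr(card, name) <= MAX_ZONE:
            errors.append(f"{name} out of range")
    for name in ("start_min", "arrive_min"):
        if not 0 <= getattr(card, name) < MINUTES_PER_DAY:
            errors.append(f"{name} out of range")
    return errors


def consistency_errors(card: TripCard) -> List[str]:
    """Examine fields in relation to one another."""
    errors = []
    if card.arrive_min < card.start_min:
        errors.append("arrive time precedes start time")
    return errors


# A 9 AM start with an 8 AM arrival: every field is individually valid.
card = TripCard(origin_zone=17, dest_zone=204, start_min=9 * 60, arrive_min=8 * 60)
print(single_field_errors(card))
print(consistency_errors(card))
```

Even this simple consistency check would needlessly reject a valid trip that starts before midnight and ends after it, which illustrates why such checks require more care than single field checks.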

The second step of an edit procedure is error rectification.
Given that there is an error, what do you do next?
There  are  three  major  alternatives:   correct  the  data  in 
error,  discard  the  data  in  error,  or  arbitrarily  replace  the 
bad  data  with  some  "good"  data.   It  is  usually  expected  that 
the  most  accurate  results  are  obtained  by  correcting  the  data 
in  error.   Normally,  this  can  be  done  by  simply  referring 
back  to  the  original  interview  data  and  perhaps  making  some 
"reasonable"  assumptions.   In  some  cases,  this  might 
necessitate  a  return  call  to  the  original  interviewees. 

If  it  is  decided  to  discard  the  data,  there  are  several 
significant  questions  to  be  considered.   Trip  data  from  the 
home  interview,  truck,  and  taxi  surveys  are  cluster  sampled, 
and there are ramifications relative to retaining a representative
sample if some or all of such data are discarded.
If the whole sample is not discarded in its entirety,
appropriate  "correction"  factors  would  have  to  be  applied  to 
the  corresponding  dwelling  unit  expansion  factors  in  deriving 
the  trip  expansion  factors  to  account  for  the  fact  that  some 
individual  trips  (or  all  trips  made  by  some  persons)  have 
been  discarded.   These  particular  considerations  are  not 
relevant  for  external  survey  data  where  it  is  the  trip  itself 
that  is  being  sampled  rather  than  the  dwelling  unit  or 
vehicle,  but  one  still  must  be  concerned  about  maintaining 
a  representative  number  of  interviews  by  hour  by  direction 
by  vehicle  type  at  the  given  station. 

The  third  alternative,  that  of  arbitrarily  replacing 
the  bad  data  with  "good"  data,  is  the  technique  which,  as 
described  in  the  Review  of  Literature,  is  being  used  more 
and  more  by  census  organizations.   With  the  exception  of  the 
one  mentioned  transportation  study,  there  have  been  no  known 
efforts to implement such methodology in the computer
processing of travel survey data. However, it would not be
surprising  to  find  that  such  techniques  have  been  manually 
applied  in  scattered  instances  by  some  studies  when  it  was 
considered  expedient  to  do  so. 

For  purposes  of  this  research  project,  three  editing 
procedures  were  investigated.   Procedure  A  consisted  of  using 
all  of  the  travel  survey  data  without  editing  it  at  all. 
While  this  procedure  eliminated  the  entire  editing  phase,  it 
was  recognized  that  there  would  be  substantial  difficulties 
in getting some of the computer programs to work satisfactorily
with input data containing such things as blank or
invalid  zone  numbers. 

Procedure  C  consisted  of  making  a  quite  sizable  number 
of  error  checks  (see  the  lists  of  error  checks  in  Appendix 
C), correcting the errors if at all possible using
"reasonable" judgement when necessary, and discarding in their
entirety those samples containing data errors which could not
be corrected. Assuming that the number of samples which
would have to be discarded is minimal (this turned out to be
the case), it was believed that one could a priori say that
procedure  C  is  the  best  editing  procedure  in  terms  of  getting 
the  most  accurate  results. 

There  was  one  deviation  from  the  stated  method  of 
correcting all data in error, if possible. After investigation
into the effects of simply deleting external survey
trip  cards  containing  errors  as  contrasted  to  correcting 
the  errors,  it  was  determined  that  the  resulting  triptable 
matrices  would  show  little  difference  and  thus  the  external 
survey  trip  cards  containing  errors  were  simply  deleted  (it 
should  be  remembered  that  the  external  survey  is  the  only 
survey in which the trip data are not cluster sampled).

Procedure  B  data  were  derived  by  making  a  quite  sizable 
number  of  error  checks  (the  same  error  checks  used  in 
procedure C) and simply discarding in their entirety all
samples containing errors. The expansion factors for the
remaining data were obtained by simply factoring the expansion
factors derived in the normal manner for the entire
survey  data  (i.e.,  procedure  A  data)  by  the  ratio  of  the 
original  number  of  samples  to  the  remaining  number  of  samples. 
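
The procedure B adjustment amounts to a single multiplication, sketched below. The function name and the numbers in the example (an expansion factor of 10.0, with 700 of 1,000 samples surviving the error checks) are hypothetical and are not results from the Evansville data.

```python
def procedure_b_factor(original_factor: float, original_samples: int,
                       remaining_samples: int) -> float:
    """Scale an expansion factor by the ratio of original to remaining samples."""
    if remaining_samples <= 0:
        raise ValueError("at least one sample must remain")
    return original_factor * original_samples / remaining_samples


# Hypothetical example: 1,000 samples collected, 700 remain after discarding errors.
print(round(procedure_b_factor(10.0, 1000, 700), 3))
```

The adjustment keeps the expanded number of samples equal to what it was before any were discarded, but it implicitly assumes that the discarded samples resemble the retained ones.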

In  both  procedure  B  and  procedure  C,  error  checks  were 
made  on  all  data  fields,  not  just  those  containing  data  used 
in  the  planning  models  investigated  in  this  research  project. 
Similarly,  data  records  in  error  were  considered  correctable 
only if all data fields containing errors could be corrected.


CHAPTER V
DATA  EDITING,  EXPANSION,  AND  FACTORING 


Data  Editing 

The  basic  data  available  to  this  project  consisted  of 
punched  but  unedited  travel  survey  data  cards  from  both 
Evansville  and  Lafayette.   The  data  from  the  home  interview 
and  truck/taxi  surveys  had  been  keypunched  and  key  verified, 
while  the  data  from  the  external  surveys  had  been  punched  by 
the  Indiana  State  Highway  Commission  mark  sense  reader  from 
the  mark  sense  forms  coded  by  the  two  coding  staffs.   In 
order  to  successfully  process  the  scan  sheets,  it  was 
necessary  to  "fix-up"  about  6  per  cent  of  the  Evansville 
forms and about 2 per cent of the Lafayette forms. Such
action was necessary in order to replace mutilated forms,
eliminate handwriting that carelessly protruded into the
coding areas, or completely erase coding marks which the
coder had subsequently changed but failed to erase properly.

All  of  the  travel  survey  data  from  both  studies  were 
subjected  to  a  fairly  comprehensive  error  checking  process 
on  the  computer.   The  error  checks  used  and  the  results  of 
their  application  are  documented  in  Appendix  C  with  some  of 
the  more  significant  data  summarized  in  Table  1.   As 
previously  planned,  the  analysis  for  this  research  from  this 
point  to  completion  made  use  of  the  Evansville  data 
exclusively. 

Effective with the completion of the error checking
process, data corresponding to procedures A and B as discussed
in the last section of Chapter IV were available.
Using "reasonable" judgement where necessary, the travel
survey data in error were corrected and combined with the
"clean" records of procedure B, thus creating a set of
edited travel survey data referred to as procedure C data in
this project.

The  travel  survey  data  were  not  linked. 

Development  of  Expansion  Factors 

To  account  for  the  fact  that  travel  survey  data  are 
collected  for  only  a  portion  of  the  dwelling  units,  vehicles 
crossing  the  cordon  line,  trucks,  and  taxis,  it  is  necessary 
to  compute  expansion  factors  for  the  trip  data  in  order  to 
obtain  a  representation  of  all  the  trips  made  in  the  study 
area.   There  are  standard  equations  for  computing  these 
expansion  factors  (53). 

These  standard  procedures  were  used  in  computing  the 
expansion  factors  for  the  procedure  A  and  procedure  B 
external  survey  data  (as  explained  in  Chapter  IV,  procedure 
C  external  survey  data  were  the  same  as  procedure  B  external 
survey  data)  and  for  the  procedure  A  and  procedure  C  truck  and 
taxi  survey  data.   However,  a  special  formula  was  used  for 
computing  the  dwelling  unit  expansion  factors  for  procedure 
A  and  procedure  C  data  (the  formula  used  was  developed  by 
the  consultant  performing  the  Evansville  study  and  approved 
by the Federal Highway Administration):

FACTOR = 10 (B + S - C - E) / (B + S + X - C - D)


where: 


B  =  Number  of  samples  initially  selected 

C  =  Number  of  samples  vacant,  demolished, 
or  commercial 

D  =  Number  of  samples  which  were  refusals, 
no  one  home,  or  sickness 

S  =  Number  of  "supplemental"  samples  (these 
were  taken  when  the  sample  address 
contained  more  households  than  expected) 

E = Number of "change of address" samples
(these were taken when the sample
address did not exist in the field)

X = Number of "extra" samples (these were
taken when the number of samples
selected in a given zone was less
than the desired one in ten)
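The dwelling unit formula can be sketched in a few lines of Python. This is only an illustration: the function name and the zone counts are hypothetical, and the arrangement of terms follows the formula as it reads in the source.

```python
def dwelling_unit_expansion_factor(b, s, c, e, x, d):
    """Return FACTOR = 10 (B + S - C - E) / (B + S + X - C - D).

    b: samples initially selected      s: "supplemental" samples
    c: vacant, demolished, commercial  e: "change of address" samples
    x: "extra" samples                 d: refusals, no one home, sickness
    """
    denominator = b + s + x - c - d
    if denominator <= 0:
        raise ValueError("no usable samples in this zone")
    return 10.0 * (b + s - c - e) / denominator


# Hypothetical counts for a single zone (not taken from the survey).
factor = dwelling_unit_expansion_factor(b=100, s=4, c=6, e=2, x=5, d=8)
print(round(factor, 3))
```

With a one-in-ten sample and no losses or extras, the factor reduces to 10, which is a quick sanity check on any implementation.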

Expansion  factors  for  the  Procedure  B  home  interview, 
truck,  and  taxi  survey  data  were  computed  by  multiplying 
the  corresponding  procedure  A  expansion  factor  by  the  ratio 
of  the  original  number  of  samples  to  the  number  of  samples 
remaining  after  rejecting  all  samples  containing  errors. 
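The procedure B adjustment is a simple rescaling, sketched below in Python. The function name and the sample counts are hypothetical; only the ratio itself comes from the text.

```python
def procedure_b_factor(factor_a, original_samples, remaining_samples):
    """Scale a procedure A expansion factor by original / remaining samples."""
    if remaining_samples <= 0:
        raise ValueError("all samples were rejected")
    return factor_a * original_samples / remaining_samples


# Hypothetical case: 1,000 samples, 100 of them rejected for errors.
factor_a = 10.0
factor_b = procedure_b_factor(factor_a, original_samples=1000, remaining_samples=900)

# The rescaling preserves the expanded number of sampling units.
assert abs(900 * factor_b - 1000 * factor_a) < 1e-9
```

The rescaling keeps the expanded total of sampling units unchanged, but it assumes the rejected samples resemble the retained ones; the trip data comparisons reported for procedure B suggest that this assumption did not hold for the composition of traffic.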

Survey  Accuracy  Checks 

After  expansion  factors  have  been  computed  and  inserted 
in  the  trip  cards,  it  is  necessary  to  make  various  standard 
comparisons  of  the  expanded  data  from  the  travel  surveys  and 
corresponding  estimates  derived  from  independent  data 
sources.   These  comparisons  are  called  survey  accuracy  checks, 
and  are  normally  grouped  into  dwelling  unit  comparisons  and 
trip  data  comparisons. 

For  all  practical  purposes,  identical  estimates  of 
population,  total  dwelling  units,  etc.  were  obtained  from 
the  expanded  procedure  A  and  procedure  C  data.   The  estimates 
obtained  from  the  procedure  B  data  did  differ  from  those 
obtained from the other procedures (see Table 2), but most
analysts would have accepted the results from any of the
three procedures.

Table 2. Dwelling Unit Characteristics As Related To
Editing Procedure

                                          Procedure
Statistic                              A        B        C

Persons per dwelling unit            3.04     2.76     3.02
Cars per dwelling unit               1.36     1.24     1.37
Average income per dwelling unit   $7,810   $7,180   $7,820


Significant problems with the procedure B data became
obvious when the trip data comparisons were made. For
example, the first work trip comparison was not good for
procedure B data while it was completely satisfactory for
the data from procedures A and C. The major problem, however,
was that the composition of traffic showed far too
many truck trips and far too few automobile trips (this
occurred because of the characteristics of the rejected
travel survey data as summarized in Table 1).

The  primary  trip  data  comparison  is  the  screenline 
crossing  analysis  in  which  the  number  of  survey  trips 
crossing  arbitrary  lines  called  screenlines  bisecting  the 
area  is  compared  against  actual  ground  counts  at  the  various 
locations.   There  were  four  screenlines  in  Evansville. 

A  tabulation  of  survey  screenline  crossings  vs.  ground 
counts  by  procedure  is  presented  in  Table  3.   Each  entry 
represents  the  sum  of  all  crossings  for  all  four  screenlines 
for  all  vehicles  for  the  given  hour.   The  tabulated  survey 
crossing  data  have  been  factored  to  reflect  the  effect  of 
double  crossings,  and  the  estimated  time  of  screenline 
crossing  has  been  based  on  trip  start  and  arrive  time  and 
the  relative  distance  of  each  of  the  two  zones  from  the 
screenline. 
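As an illustration of the comparison, the short Python sketch below expresses the 16-hour survey crossing totals as a percentage of the ground count total. The totals are those listed in Table 3; the variable and function names are hypothetical.

```python
# 16-hour totals of expanded but unfactored survey crossings (Table 3).
survey_totals = {"A": 334_042, "B": 304_735, "C": 334_539}
ground_count_total = 401_878


def percent_of_ground_count(survey_crossings, ground_count):
    """Survey screenline crossings as a percentage of the ground count."""
    return 100.0 * survey_crossings / ground_count


coverage = {
    procedure: percent_of_ground_count(total, ground_count_total)
    for procedure, total in survey_totals.items()
}

for procedure, percent in coverage.items():
    print(f"Procedure {procedure}: {percent:.1f} per cent of ground count")
```

The same function can be applied hour by hour to show where in the day the survey falls furthest below the counts.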

After  examining  the  results  of  the  survey  accuracy 
checks,  a  prudent  analyst  would  probably  have  questioned 
proceeding  using  the  procedure  B  data.   However,  it  was 
decided  that,  in  fulfilling  the  objectives  of  this  research 
project,  it  would  be  advantageous  to  continue  using  the 
data. 

Development  and  Application  of  Trip  Factors 

It  has  long  been  recognized  that  there  is  substantial 
underreporting   of  non-work  trips,  and  it  is  customary  to 
factor  the  expanded  trip  data  to  compensate  for  the  missing 
trips.   Using  prudent,  justifiable  logic,  factors  are 
developed  by  trip  purpose  with  the  objective  of  attempting 
to  approximate  the  actual  ground  counts  at  the  screenlines. 


Table 3. Total Screenline Crossings As Obtained From
Expanded But Unfactored Survey Data

                          Procedure
                                                  Ground
Hour               A          B          C        Count

6-7 AM          18,586     17,771     18,351     18,476
7-8             33,426     30,349     33,600     34,338
8-9             20,347     17,717     20,283     21,700
9-10            15,538     14,671     15,524     19,740
10-11           16,286     16,868     16,311     21,416
11 - NOON       16,995     15,648     16,889     23,838
NOON - 1 PM     19,936     16,100     19,948     25,152
1-2             17,882     17,371     17,984     24,302
2-3             22,458     21,132     22,582     27,804
3-4             32,783     30,591     32,636     38,814
4-5             33,469     30,960     33,622     38,572
5-6             32,173     29,819     32,361     34,596
6-7             18,860     16,300     18,877     24,756
7-8             14,252     11,450     14,390     18,918
8-9             10,989      9,660     10,985     15,698
9-10 PM         10,062      8,328     10,196     13,758

TOTAL          334,042    304,735    334,539    401,878

Table 4. Trip Factors To Compensate For Underreporting

                                      Procedure
Trip
Purpose       Hours               A       B       C

HBW           all hours         1.00    1.10    1.00

HBS & HBO     6-8 AM            1.00    1.00    1.00
              rest of day       1.34    1.80    1.34

NHB           6-8 AM            1.00    1.00    1.00
              rest of day       1.53    2.10    1.53

Trucks        6 AM - 3 PM       1.31    1.00    1.31
              rest of day       1.96    1.00    1.96

All Others    all hours         1.00    1.00    1.00

Such an approach was followed in this research project,
and  the  resulting  factors  by  purpose  are  given  in  Table  4. 
These  factors  were  based  on  an  assumption  that  people  tend  to 
remember  their  first  trip  of  the  day  better  than  subsequent 
trips  and  that,  due  to  the  high  sampling  rate,  external  and 
taxi  trips  could  not  be  justifiably  factored.   Because  the 
trip  data  from  procedures  A  and  C  as  tabulated  in  Table  3 
were,  for  all  practical  purposes,  identical,  it  was  felt 
that  the  use  of  different  factors  for  those  two  sets  of  data 
might  introduce  differences  not  really  attributable  to  the 
actual  differences  in  the  original  data.   Therefore,  as 
shown in Table 4, the same factors were used for both
procedures  A  and  C. 
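The way such factors are applied can be sketched as follows, using the procedure A and C values listed in Table 4. The record layout (purpose code, hour on a 24-hour clock, expanded trips) and the sample records are assumptions made for the illustration and are not the survey card format.

```python
def trip_factor(purpose, hour):
    """Procedure A and C underreporting factor; hour is 0-23 (assumed)."""
    if purpose in ("HBS", "HBO"):
        return 1.00 if 6 <= hour < 8 else 1.34
    if purpose == "NHB":
        return 1.00 if 6 <= hour < 8 else 1.53
    if purpose == "TRUCK":
        return 1.31 if 6 <= hour < 15 else 1.96
    return 1.00  # HBW and all other trips are left unfactored


# Hypothetical expanded trip records: (purpose, hour, expanded trips).
records = [
    ("HBW", 7, 100.0),
    ("HBS", 7, 50.0),
    ("HBS", 14, 50.0),
    ("NHB", 10, 20.0),
    ("TRUCK", 16, 10.0),
]

factored_total = sum(trips * trip_factor(purpose, hour)
                     for purpose, hour, trips in records)
print(round(factored_total, 1))
```

Keeping the factors in one function makes it easy to substitute the procedure B values and to confirm that early morning and work trips pass through unchanged.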

The  effects  on  the  screenline  crossings  of  applying 
these  factors  are  shown  in  Table  5.   The  resulting  number  of 
trips,  as  compared  to  the  ground  counts,  is  perhaps  slightly 
higher  than  one  would  like,  but  since  all  three  procedures 
gave  results  which  were  slightly  high,  it  was  felt  that  these 
factored  data  would  be  suitable  for  their  intended  use  in 
this  research  project. 

Table 6 shows the composition of total vehicle trips by
mode by procedure and the effect of the factoring process on
the composition.

Table 5. Total Screenline Crossings As Obtained From
Expanded And Factored Survey Data

                          Procedure
                                                  Ground
Hour               A          B          C        Count

6-7 AM          18,752     18,672     18,517     18,476
7-8             34,220     32,199     34,396     34,338
8-9             23,874     22,116     23,799     21,700
9-10            18,975     19,589     18,966     19,740
10-11           20,398     23,020     20,444     21,416
11 - NOON       21,833     22,677     21,719     23,838
NOON - 1 PM     25,784     24,046     25,791     25,152
1-2             22,616     24,512     22,721     24,302
2-3             27,876     29,499     28,057     27,804
3-4             40,584     41,748     40,463     38,814
4-5             40,928     40,621     41,103     38,572
5-6             38,007     39,454     38,221     34,596
6-7             23,247     23,710     23,285     24,756
7-8             18,303     17,638     18,477     18,918
8-9             13,956     14,836     13,937     15,698
9-10 PM         12,522     12,361     12,670     13,758

TOTAL          401,875    406,698    402,566    401,878


CHAPTER VI
TRAFFIC ASSIGNMENT


O&D  Triptables 

The  basic  input  to  the  traffic  assignment  process  is 
total  purpose  triptables,  supposedly  in  O&D  format,  to  be 
subsequently  "loaded"  onto  the  network.   Using  the  data  from 
each  of  the  three  procedures  being  investigated,  three  total 
purpose,  O&D,  ADT,  vehicle  triptables  were  built. 

Since  it  was  a  priori  stated  that  procedure  C  is  the 
best in terms of accuracy of results, the triptables
corresponding to procedures A and B were compared to that
corresponding to procedure C. The results of these comparisons
are  given  in  Tables  7  and  8,  respectively.   These  comparisons 
gave  per  cent  root  mean  square  error  rates,  the  standard 
measure  of  error  used  in  transportation  planning  comparisons 
of  triptables  and  network  loads,  of  14.89  and  111.02, 
respectively. 
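A minimal sketch of a per cent root mean square error computation is given below. The source does not state the exact form used, so the common definition is assumed here: the root mean square difference over all cells, divided by the mean of the reference values, times 100. The function name and the cell values are hypothetical.

```python
import math


def percent_rmse(test_values, reference_values):
    """100 * sqrt(mean squared difference) / mean of the reference values."""
    if len(test_values) != len(reference_values) or not reference_values:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference_values)
    squared_sum = sum((t - r) ** 2
                      for t, r in zip(test_values, reference_values))
    rmse = math.sqrt(squared_sum / n)
    mean_reference = sum(reference_values) / n
    return 100.0 * rmse / mean_reference


# Hypothetical triptable cells, flattened to lists (reference = procedure C).
reference_cells = [10.0, 20.0, 30.0, 40.0]
test_cells = [12.0, 18.0, 33.0, 37.0]
error = percent_rmse(test_cells, reference_cells)
print(round(error, 2))
```

Because the measure is relative to the mean cell value, a table with many small cells can yield a value above 100 per cent even when the absolute differences are modest.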

Network  Loads 

The Evansville traffic assignment network consists of
314 zones (292 internal zones and 22 external stations),
3943 directional links, and a last node number of 2110.
Each of the O&D triptables corresponding to procedures A, B,
and C was loaded onto the network using vines built on the
basis of observed travel time. The results of the
unrestrained loading process are tabulated in Table 9, and
the directional link loads are compared in Tables 10 and 11.

The per cent root mean square error of load A compared
to load C is 1.53 with a maximum directional ADT volume
difference of 216. The corresponding statistics for load B
compared to load C are 7.91 and 903.

Table 9. Network Loading Statistics

                                   Procedure
Statistic                   A            B            C

Trips Loaded             482,027      495,103      483,443

Vehicle Miles
  Local Links             93,545       96,856       93,754
  Other Links          1,811,593    1,821,181    1,816,706

Vehicle Hours
  Local Links              7,501        7,747        7,514
  Other Links             57,755       58,215       57,902

No  restrained  loads  were  made  because  of  concern  for 
the  accuracy  of  the  link  count  data  (see  Chapter  IV)  and 
fear  that  the  effects  of  a  restraint  process  might  confound 
the  results  directly  attributable  to  differences  in  the 
travel  survey  data. 


[Tables 10 and 11: comparisons of directional link loads by editing procedure (computer printouts, not reproduced)]


CHAPTER  VII 
TRIP  DISTRIBUTION 


General 


Using  base  year  data,  it  is  necessary  to  calibrate  a 
trip  distribution  model  for  subsequent  distribution  of 
forecast  year  trips.   Because  of  the  predominance  of  its  use 
in  distributing  internal-internal  trips,  the  gravity  model 
was  selected  for  application  in  this  research  project. 

The objectives of this research project did not dictate
the calibration of the trip distribution model for all purposes
but rather called for an investigation to determine if
different  results  of  the  calibration  process  would  occur  if 
the  different  sets  of  edited  travel  survey  data  were  used 
as  input.   This  investigation  was  restricted  to  the  home 
based  work  and  home  based  shopping  purposes. 

In  preparation  for  applying  the  gravity  model,  the 
impedance  dataset  (commonly  called  "skimmed  trees")  was 
updated  to  account  for  intrazonal  travel  time  and  terminal 
time.   P&A  purpose  triptables  (one  dataset  for  each  of  the 
three  editing  procedures  being  investigated)  were  built,  and 
the  required  P&A  cards  punched.   Using  the  updated  impedance 
dataset  and  the  three  triptable  datasets,  the  observed  trip 
length  distributions  by  purpose  were  obtained  for  each  of 
the  three  editing  procedures. 
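The summary statistics of a trip length distribution can be sketched in Python as below. The triptable and the impedance matrix are hypothetical two-zone examples, and the standard deviation is computed as a trip-weighted population value, which is an assumption about the form used.

```python
import math


def trip_length_stats(triptable, skim_minutes):
    """Return (total trips, total trip hours, mean minutes, std deviation)."""
    total_trips = 0.0
    total_minutes = 0.0
    total_squared = 0.0
    for trip_row, time_row in zip(triptable, skim_minutes):
        for trips, minutes in zip(trip_row, time_row):
            total_trips += trips
            total_minutes += trips * minutes
            total_squared += trips * minutes ** 2
    mean = total_minutes / total_trips
    variance = total_squared / total_trips - mean ** 2
    return total_trips, total_minutes / 60.0, mean, math.sqrt(max(variance, 0.0))


# Hypothetical two-zone example; diagonal cells hold intrazonal times.
triptable = [[10.0, 30.0], [20.0, 40.0]]
skim_minutes = [[3.0, 12.0], [12.0, 4.0]]
trips, trip_hours, mean_minutes, std_minutes = trip_length_stats(
    triptable, skim_minutes)
print(trips, round(trip_hours, 2), round(mean_minutes, 2), round(std_minutes, 2))
```

The average trip length in minutes is simply total trip hours times 60 divided by total trips, which gives a quick check on any tabulated summary.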

A manual comparison of the P&A data and the trip length
distributions for each of the two purposes showed very little
difference between corresponding data items from the procedure
A and the procedure C data. There were differences
between the procedure B and the procedure C data, but these
appeared to be somewhat random in nature.

Results

For  each  of  the  three  editing  procedures  and  for  each 
of the two purposes (i.e., a total of six times), the gravity
model was applied once using three iterations to balance
attractions.   A  single  set  of  "reasonable"  friction  factors 
was  used  for  all  three  home  based  work  distributions,  and  a 
second  set  of  "reasonable"  friction  factors  was  used  for  all 
three  home  based  shopping  distributions.   The  resulting 
effects  on  the  trip  length  distributions  are  summarized  in 
Tables  12  and  13,  and  the  manually  computed  ratios  of  the 
number  of  observed  trips  to  the  number  of  distributed  trips 
are  summarized  in  Table  14. 
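A single application of the gravity model with attraction balancing can be sketched as follows. This is a simplified illustration and not the program used in the project: the friction factors are supplied directly as a zone-to-zone matrix, and the productions, attractions, and factor values are hypothetical.

```python
def gravity_model(productions, attractions, friction, iterations=3):
    """Distribute productions to zones in proportion to A_j * F_ij.

    After each pass the attraction terms are rescaled by the ratio of
    desired to received attractions, and the distribution is repeated.
    """
    n = len(productions)
    adjusted = list(attractions)
    trips = None
    for _ in range(iterations):
        if trips is not None:
            for j in range(n):
                received = sum(trips[i][j] for i in range(n))
                if received > 0:
                    adjusted[j] *= attractions[j] / received
        trips = []
        for i in range(n):
            weights = [adjusted[j] * friction[i][j] for j in range(n)]
            total = sum(weights)
            trips.append([productions[i] * w / total if total > 0 else 0.0
                          for w in weights])
    return trips


# Hypothetical two-zone inputs (total productions equal total attractions).
productions = [100.0, 200.0]
attractions = [150.0, 150.0]
friction = [[1.0, 0.5], [0.5, 1.0]]

distributed = gravity_model(productions, attractions, friction, iterations=3)
for row in distributed:
    print([round(value, 1) for value in row])
```

Row totals always equal the productions; the balancing iterations only bring the column totals closer to the desired attractions.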
Table 12. Home Based Work Trip Length Distribution Summary
Statistics

                                               Procedure
Trips         Statistic                     A        B        C

Observed      Total Trips                74,794   69,729   74,664
              Total Trip Hours           16,303   15,289   16,265
              Average Trip Length
              (Minutes)                   13.08    13.16    13.07
              Standard Deviation           5.49     5.42     5.49
              Total Intra Trips           1,466    1,425    1,477

Distributed   Total Trips                74,794   69,729   74,664
              Total Trip Hours           15,805   14,863   15,781
              Average Trip Length
              (Minutes)                   12.68    12.79    12.68
              Standard Deviation           5.38     5.43     5.38
              Total Intra Trips           1,063    1,003    1,082

Table 13. Home Based Shopping Trip Length Distribution
Summary Statistics

                                               Procedure
Trips         Statistic                     A        B        C

Observed      Total Trips                56,333   59,593   56,292
              Total Trip Hours            8,585    8,975    8,542
              Average Trip Length
              (Minutes)                    9.14     9.04     9.11
              Standard Deviation           4.31     4.33     4.28
              Total Intra Trips           4,261    4,480    4,377

Distributed   Total Trips                56,333   59,593   56,292
              Total Trip Hours            8,700    9,258    8,674
              Average Trip Length
              (Minutes)                    9.27     9.32     9.25
              Standard Deviation           4.42     4.45     4.42
              Total Intra Trips           3,432    3,467    3,495


Table 14. Ratio Of Observed To Distributed Trips By Time
Increment

              Home Based Work         Home Based Shopping
                 Procedure                 Procedure
Time          A      B      C           A      B      C

1, 2, 3     0.88   0.75   0.86        1.24   1.28   1.25
4           1.15   1.19   1.13        0.97   1.09   0.98
5           1.05   0.96   1.06        0.90   0.90   0.90
6           0.70   0.74   0.71        0.88   1.02   0.87
7           0.92   0.92   0.94        1.21   1.18   1.22
8           0.79   0.73   0.79        0.95   0.97   0.95
9           0.94   1.02   0.93        0.98   0.89   0.97
10          0.94   0.91   0.95        0.97   0.89   0.98
11          1.01   1.06   1.02        0.97   0.97   0.97
12          0.97   0.97   0.97        1.13   1.10   1.13
13          0.97   0.87   0.95        1.01   1.19   1.01
14          1.10   1.16   1.09        0.92   1.00   0.92
15          1.07   1.06   1.07        0.83   0.65   0.85
16          1.25   1.30   1.26        0.86   0.62   0.85
17          1.15   1.23   1.16        0.89   0.85   0.86
18          1.00   0.98   1.00        1.53   1.53   1.55
19          1.11   1.07   1.10        1.53   1.09   1.53
20          1.20   1.09   1.20        1.00   0.79   0.95
21          0.96   1.09   0.97        0.63   0.34   0.67
22          1.09   1.01   1.05        0.49   0.66   0.48
23          0.93   1.08   0.89        0.68   0.69   0.71
24          1.23   1.17   1.24        1.39   1.53   1.23
25          0.97   0.98   0.98        0.62   0.96   0.61
26          0.94   0.80   0.98        0.44   0.27   0.45
27          0.90   0.64   0.86        1.33   1.02   0.97
28          0.79   0.75   0.75        0.39   0.72   0.41
29          0.90   1.07   0.99        0.00   0.00   0.00
30          0.84   0.75   0.83        1.08   2.11   1.08
31          1.27   1.66   1.22        1.41   1.64   1.41
32          1.28   0.86   1.33        1.73   3.28   1.63
33          1.40   0.93   1.40        0.00   0.00   0.00
34          1.16   1.58   1.08        0.00   0.00   0.00
35          1.75   1.74   1.64        0.00   0.00   0.00
36          3.65   2.51   3.75        0.00   0.00   0.00
37          1.28   0.76   1.28
38          1.67   1.00   1.74         --    0.00    --
39          0.60   0.00   0.56         --    0.00
40          0.95   1.52   0.91
41          2.94   2.64   2.94
42           --    0.00    --
43          0.00   0.00   0.00
44, 45      0.00   0.00   0.00


CHAPTER VIII
TRIP GENERATION


General 


Using base year data, it is necessary to develop mathematical
models which will predict the total number of trips
produced  by  or  attracted  to  a  zone.   Traditionally,  multiple 
regression  models  with  the  individual  observation  being 
either  a  zonal  total  or  a  zonal  average  have  been  used,  so 
this  was  the  method  selected  for  application  and  investigation 
in  this  research  project. 

It  should  be  borne  in  mind  that  the  objectives  of  this 
research  project  could  be  satisfied  simply  by  making  the 
determination  that  the  results  of  the  trip  generation  model 
development phase using the different sets of data corresponding
to editing procedures A, B, and C would be either
reasonably identical or else significantly different. It
was  not  necessary  to  do  all  of  the  analyses  normally  done  in 
arriving  at  final  equations,  nor  was  it  necessary  to  develop 
all  of  the  customary  equations.   The  equations  selected  for 
investigation  were  home  based  work  productions,  home  based 
work  attractions,  home  based  shopping  productions,  and  home 
based  shopping  attractions. 

The proportions of the total trips within the study area
made for the purposes of home based work and home based
shopping are given in Table 6. Because the survey trips
were not linked, the proportion of home based work trips is
slightly less than what studies with linked trips report.

In  developing  the  four  equations  for  the  two  purposes 
under  investigation,  some  of  the  data  were  discarded  before 
the  analysis  in  order  not  to  bias  the  results.   For  the 
home based production equations, data from 88 zones were
discarded, leaving data from 204 zones (the 88 zones eliminated
consisted of all those with 30 or fewer total dwelling
units plus a few others deemed desirable to delete). In
addition to a few other zones deleted on the basis of
judgement, those zones with fewer than 50 total employees
were deleted from the home based work attraction analysis
and those zones with fewer than 50 retail sales employees
were deleted from the home based shopping attraction analysis.
This left 191 zones and 147 zones, respectively.
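
The zone screening described above can be sketched as a simple filter. This is an illustrative sketch only: the function name, the field names, and the zone records are hypothetical and are not taken from the study's programs. The thresholds follow the text (zones with 30 or fewer dwelling units dropped from the production analysis, zones with fewer than 50 employees dropped from the attraction analysis).

```python
def zones_retained(zones, attribute, minimum, judgement_deletions=()):
    """Return the zones kept for analysis.

    A zone is kept when zones[id][attribute] >= minimum and its id is not
    listed in judgement_deletions (zones removed by the analyst's judgement).
    """
    return {
        zone_id: data
        for zone_id, data in zones.items()
        if data[attribute] >= minimum and zone_id not in judgement_deletions
    }


# Hypothetical zone records; the values are made up for illustration.
zones = {
    101: {"dwelling_units": 30, "total_employees": 420},
    102: {"dwelling_units": 31, "total_employees": 49},
    103: {"dwelling_units": 250, "total_employees": 50},
}

# "30 or fewer" dwelling units are dropped, so the minimum kept is 31.
production_zones = zones_retained(zones, "dwelling_units", 31)
# "Fewer than 50" employees are dropped, so the minimum kept is 50.
work_attraction_zones = zones_retained(zones, "total_employees", 50)
```

Keeping the judgement deletions as an explicit list, rather than editing the data, leaves a record of which zones were removed and why.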

Production  Equations 

Average  zonal  values  were  used  in  investigating  the 
effects  of  editing  procedures  upon  the  production  equations. 
The  dependent  variables  were  vehicle  trips  per  dwelling  unit 
(by purpose by editing procedure), and the independent
variables used were persons per dwelling unit, cars per
dwelling unit, and income per dwelling unit (no labor force
data were readily available). All of the variables were
derived  from  the  travel  survey  data  and  thus  were  dependent 
upon  it.   The  means  of  the  various  variables  are  tabulated 
in  Table  15,  and  the  correlations  between  them  are  tabulated 
in  Table  16. 

BMD02R was used to build the six production equations.
Typically,  cars  per  dwelling  unit  was  the  first  variable 
selected  in  all  six  equations.   When  examining  the  regression 
output  for  the  three  editing  procedures,  one  could  only 
conclude  that  the  results  obtained  by  using  procedure  A  data 
and  procedure  C  data  were,  for  all  practical  purposes, 
identical.   However,  the  results  obtained  using  procedure  B 
data  did  differ. 

Table 15.  Production Equation Variable Means

                                        Procedure
Variable                          A         B         C

HBW Trips/D.U.                  1.315     1.235     1.314
HBS Trips/D.U.                  0.983     1.027     0.975
Cars/D.U.                       1.450     1.347     1.448
Persons/D.U.                    3.142     2.865     3.139
Income/D.U. (in $1,000's)       8.405     7.908     8.407


Table 16.  Correlation Data For Home Based Production
Equation Variables

                           Independent Variable*
Variable             Cars/D.U.   Persons/D.U.   Income/D.U.

HBW Trips/D.U.         0.730        0.543          0.542
                       0.576        0.416          0.381
                       0.738        0.547          0.553

HBS Trips/D.U.         0.440        0.303          0.235
                       0.281        0.220          0.098
                       0.448        0.312          0.258

Cars/D.U.              1.000        0.643          0.756
                       1.000        0.472          0.751
                       1.000        0.649          0.764

Persons/D.U.                        1.000          0.548
                                    1.000          0.484
                                    1.000          0.555

Income/D.U.                                        1.000
                                                   1.000
                                                   1.000

*In each cell, procedure A data is on the top
line, procedure B data on the middle line, and
procedure C data on the bottom line.
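
The observation that cars per dwelling unit entered first in all six production equations can be checked against Table 16. The sketch below assumes the usual forward stepwise rule, under which the first variable to enter is the one with the largest absolute simple correlation with the dependent variable; the internal logic of BMD02R is not reproduced here, and the function and dictionary names are hypothetical.

```python
# Simple correlations with the dependent variable, copied from Table 16.
# Keys are (trip purpose, editing procedure).
CORRELATIONS = {
    ("HBW", "A"): {"cars": 0.730, "persons": 0.543, "income": 0.542},
    ("HBW", "B"): {"cars": 0.576, "persons": 0.416, "income": 0.381},
    ("HBW", "C"): {"cars": 0.738, "persons": 0.547, "income": 0.553},
    ("HBS", "A"): {"cars": 0.440, "persons": 0.303, "income": 0.235},
    ("HBS", "B"): {"cars": 0.281, "persons": 0.220, "income": 0.098},
    ("HBS", "C"): {"cars": 0.448, "persons": 0.312, "income": 0.258},
}


def first_variable_entered(correlations):
    """Return the candidate with the largest absolute simple correlation."""
    return max(correlations, key=lambda name: abs(correlations[name]))


for equation, correlations in CORRELATIONS.items():
    print(equation, first_variable_entered(correlations))
```

Under that assumption the tabulated correlations select cars per dwelling unit for every purpose and procedure, which agrees with the regression output described in the text.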

Attraction Equations

Zonal  total  vehicle  trips  were  used  in  investigating 
the  effects  of  editing  procedures  upon  the  attraction 
equations.   Total  employment  by  zone,  employment  by  category 
by  zone,  and  retail  sales  were  the  independent  variables 
available  for  analysis.   None  of  these  variables  came  from 
the  travel  survey  data,  so  only  the  dependent  variables 
differed  for  the  various  editing  procedures. 

BMD02R  was  used  to  build  the  six  attraction  equations. 
A  satisfactory  result  was  obtained  with  the  single  variable 
total  employment  in  all  three  home  based  work  equations,  and 
a  similar  result  was  obtained  with  the  single  variable  retail 
sales  employment  in  all  three  home  based  shopping  equations. 
The  results  of  applying  those  single  variable  equations  to 
zones  with  different  numbers  of  employees  are  summarized  in 
Tables  17  and  18,  respectively. 


Table 17.  Estimated Home Based Work Attractions

                     Number of Total Employees
Procedure       50      400    1,000    2,000    3,000

A               78      395      939    1,845    2,752
B               78      368      866    1,696    2,525
C               78      394      936    1,839    2,743


Table 18.  Estimated Home Based Shopping Attractions

                  Number of Retail Sales Employees
Procedure       50      100      200      500    1,000

A              165      349      717    1,822    3,662
B              173      371      767    1,955    3,936
C              166      349      716    1,817    3,651
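
The coefficients of the single variable attraction equations are not listed in the text, but the straight lines implied by Table 17 can be recovered approximately by fitting a line through the tabulated points. The sketch below is an illustration under that assumption; because the tabulated values are rounded, the recovered coefficients are only approximate, and the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Values copied from Table 17.
EMPLOYEES = [50, 400, 1000, 2000, 3000]
HBW_ATTRACTIONS = {
    "A": [78, 395, 939, 1845, 2752],
    "B": [78, 368, 866, 1696, 2525],
    "C": [78, 394, 936, 1839, 2743],
}

IMPLIED = {proc: fit_line(EMPLOYEES, trips) for proc, trips in HBW_ATTRACTIONS.items()}
for proc, (intercept, slope) in IMPLIED.items():
    print(f"{proc}: trips = {intercept:.1f} + {slope:.3f} * employees")
```

The fitted slopes come out near 0.91 trips per employee for procedures A and C and near 0.83 for procedure B, which restates in one number per procedure the pattern visible in the table.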


CHAPTER IX
EFFECTS  OF  DIRECTIONAL  SPLIT  ASSUMPTIONS 


General 

In investigating the effect of an assumed 50-50 directional
split in converting triptables from P&A format to
O&D  format,  only  the  "best"  data  (i.e.,  procedure  C  data) 
were  used.   A  total  purpose  O&D  triptable  was  built  from 
trip  cards,  and  P&A  triptables  were  built  from  the  same  set 
of  trip  cards.   The  P&A  triptables  which  were  built  were  home 
based  work,  home  based  shopping,  home  based  other,  all 
non-home  based  (including  those  trips  inventoried  in  the 
external,  truck,  and  taxi  surveys),  and  total  purpose.   All 
triptables  consisted  of  ADT  vehicle  trips. 

From  the  one  merged  P&A  triptable  dataset,  two  new 
total  purpose  triptables  were  built.   The  first  of  these  was 
obtained  by  splitting  each  of  the  three  home  based  purposes 
50-50  and  adding  in  the  non-home  based  trips.   The  second 
of  the  new  triptables  was  built  by  applying  the  computed 
areawide  average  directional  splits  (52.71-47.29,  47.81-52.19, 
and 54.10-45.90, respectively) to each of the three home
based  purposes  and  adding  in  the  non-home  based  trips. 
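
The conversion from P&A format to O&D format can be sketched as follows. This is a minimal illustration and not the SPLIT program itself. It assumes that the first figure of each quoted split (for example, 52.71) is the share of trips travelling from the production zone to the attraction zone; the text does not state which direction each figure refers to. The function name and the two-zone triptable are hypothetical.

```python
def pa_to_od(pa, production_to_attraction_share=0.5):
    """Convert a square P&A triptable to O&D format.

    pa[p][a] holds trips produced in zone p and attracted to zone a.
    A share s of those trips is assumed to travel p -> a and the
    remainder (1 - s) to travel a -> p.
    """
    n = len(pa)
    s = production_to_attraction_share
    return [
        [s * pa[i][j] + (1.0 - s) * pa[j][i] for j in range(n)]
        for i in range(n)
    ]


# Made-up two-zone home based triptable.
pa = [[10.0, 40.0],
      [0.0, 6.0]]

od_even = pa_to_od(pa)             # assumed 50-50 split
od_average = pa_to_od(pa, 0.5271)  # areawide average split (direction assumed)
```

Either split leaves the total number of trips unchanged; only the balance between the two directions of each zone pair differs, which is why the two resulting triptables load so similarly.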

Comparison  of  Network  Loads 

All  three  total  purpose  triptables  (O&D,  50-50  split 
P&A,  and  correctly  split  P&A)  were  separately  loaded  onto 
the  network  and  the  loads  compared. 

Table  19  compares  the  directional  link  volumes  obtained 
from  the  50-50  split  triptable  to  those  from  the  correctly 
split  triptable.   The  maximum  volume  difference  on  any  link 
in  the  network  was  250  trips,  and  the  per  cent  root  mean 
square  error  was  1.79. 
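
The two comparison statistics can be sketched as below. The text does not give the formula used for the per cent root mean square error; the sketch assumes the common definition, namely the root mean square of the link volume differences divided by the mean base volume and multiplied by 100. The function name and the example volumes are hypothetical.

```python
import math


def compare_loads(test_volumes, base_volumes):
    """Return (maximum absolute difference, per cent RMSE) for paired link volumes."""
    if len(test_volumes) != len(base_volumes) or not base_volumes:
        raise ValueError("volume lists must be non-empty and of equal length")
    diffs = [t - b for t, b in zip(test_volumes, base_volumes)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mean_base = sum(base_volumes) / len(base_volumes)
    return max(abs(d) for d in diffs), 100.0 * rmse / mean_base


# Made-up directional link volumes for two loadings of the same network.
max_diff, pct_rmse = compare_loads([110, 190], [100, 200])
```

Reporting the maximum difference alongside the per cent RMSE guards against a small average error hiding one badly loaded link.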

Table 20 compares the directional link volumes obtained
from the correctly split triptable to those from the O&D
triptable, and Table 21 makes the corresponding comparison
using the 50-50 split triptable. The maximum volume differences
on any link in the network were 302 and 290,
respectively. The corresponding per cent root mean square
errors were 2.35 and 2.15. Thus, the 50-50
split triptables matched O&D loads slightly better than the
"correctly" split triptables (this seeming paradox is
undoubtedly due to errors introduced by using areawide
average directional splits).

In passing, Table 22 dramatically points out the necessity
of converting P&A triptables to O&D format before making
any  comparisons  against  O&D  assignment  data.   When  a  P&A 
triptable  was  loaded  onto  the  network  and  compared  against 
the  loads  obtained  by  using  the  corresponding  O&D  triptable, 
the  maximum  volume  difference  on  any  link  jumped  to  4,777 
with  a  33.09  per  cent  root  mean  square  error. 


CHAPTER X
ADDITIONAL  ERROR  ANALYSES 


Residual  Incorrectly  Recorded  Data  Errors 

As  pointed  out  in  Chapter  I,  computer  error  checking,  no 
matter  how  thoroughly  done,  only  detects  a  portion  of  the 
total  number  of  incorrectly  recorded  data  errors.   This  is 
because  many,  if  not  most,  such  errors  result  in  legitimate 
but  erroneous  codes  which  pass  the  error  checks. 

In  order  to  get  a  better  understanding  of  the  true 
number  of  incorrectly  recorded  data  errors  and  their  causes, 
it  was  decided  to  go  back  and  recode  a  statistical  sample  of 
the  original  interview  data  and  compare  the  recoded  data 
with  the  originally  coded  data.   It  was  determined  that  a 
systematic  sample  would  be  statistically  valid,  and  thus  1 
in  10  of  the  Lafayette  home  interviews,  1  in  8  of  the 
Evansville home interviews, 1 in 10 of the Evansville external
interviews, 1 in 6 of the Evansville truck samples, and
1  in  2  of  the  Evansville  taxi  samples  were  recoded. 
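
A systematic sample of the kind described can be sketched as follows. The text does not say how the starting interview was chosen; the sketch assumes a random start within the first sampling interval, which is the usual practice. The function name and the serial numbers are hypothetical.

```python
import random


def systematic_sample(records, interval, rng=random):
    """Select every `interval`-th record after a random start in the first interval."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    start = rng.randrange(interval)
    return records[start::interval]


# Hypothetical interview serial numbers; a 1 in 10 sample as for the
# Lafayette home interviews.
interviews = list(range(1, 101))
recode_sample = systematic_sample(interviews, 10, random.Random(0))
```

Passing an explicitly seeded generator makes the selection repeatable, so the same interviews can be identified again if the recoding has to be audited.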

Since  the  original  coding  appeared  directly  on  the 
interview  forms,  it  was  not  practical  to  independently  recode 
the  data.   However,  the  personnel  doing  the  recoding  were 
strongly  urged  to  ignore  the  original  coding  and  their  work 
was  frequently  spot  checked  to  insure  that  they  were 
performing  adequately. 

After the data were recoded, they were digitized and
compared against the originally coded data. Discrepancies
were noted and resolved by subsequent reference to the
original interview data, and, finally, error rates were
computed (see Table 23). Because in a strict sense the data
were not independently recoded, it cannot be asserted that
the tabulated error rates are "true" or "total" error rates.
However, it is felt that they approximate the total error
rate. The significant point is that the total number of
incorrectly recorded data errors is a few times the number of
such errors which can be found by computer error checking.
Computations disclose that the errors are not randomly
distributed but tend toward the most frequently occurring
values.

An  analysis  of  the  causes  of  the  various  errors 
discloses  that  "procedural"  errors  are  by  far  the  major 
cause of such coding errors (for example, see Table 24).
Errors such as picking up a zone code from the wrong line of
a coding index, picking up the zone code for an even numbered
street address when the actual number is odd, and incorrectly
using the destination zone number of the preceding trip as
the  origin  zone  number  of  the  next  trip  are  examples  of 
procedural  errors. 

Higher  error  rates  are  found  where  the  coder  must 
exercise  judgement,  such  as  in  the  determination  of  a  land 
use  or  commodity  code.   When  the  coders  apparently  thought 
that  there  was  small  necessity  for  a  correct  code,  it 
frequently  appeared  that  they  were  excessively  careless  in 
their  code  selection. 

There  were  many  occurrences  of  incorrect  conversion  of 
time  to  the  24  hour  clock  used  in  both  the  Evansville  and 
Lafayette  studies. 

Table 24.  Breakdown Of Errors In Coding Of Trip End
Addresses By Causes

                                      Per Cent of Errors
                                Evansville
                                 (except     Evansville
Cause                           external)     External    Lafayette

Bad Interview Information          4.9          36.3         4.2
Procedural Errors                 64.2          26.1        60.1
Discrepancy in Reference
  Material                         3.1           1.7         5.8
Digit Substitution                22.6          21.5        21.3
Transposition                      2.8           3.5         5.5
Omission of Digit(s)               2.4          10.9         3.1

Substantially  higher  error  rates  were  found  in  the 
coding  of  the  external  survey  data  on  mark  sense  forms. 
This  was  partially  due  to  the  coders  having  to  "manufacture" 
data  which  the  field  interviewers  (sometimes  understandably) 
failed  to  obtain,  but  primarily  it  was  a  function  of  the 
great  difficulty  in  coding  mark  sense  forms. 

Almost  all  incorrectly  recorded  data  errors  were  made 
by  the  coders.   Relatively  few  were  caused  by  bad  interview 
information,  and  even  fewer  were  caused  by  those  keypunch 
errors  which  survived  key  verification. 

Keypunching  Errors 

With  the  exception  of  the  Evansville  external  data,  all 
of  the  travel  survey  data  which  were  recoded  as  described  in 
the  previous  section  were  keypunched.   Thus  there  was 
opportunity  for  investigating  keypunch  errors,  and  it  was 
decided  to  do  so. 

The  travel  survey  data  were  initially  keypunched  but  not 
verified, after which duplicate sets of cards were mechanically
made. After the original sets of cards were subsequently
verified,  the  corresponding  sets  of  cards  were  compared.   The 
three  keypunch  shops  involved  were  not  told  the  reason  for  the 
two  step  process,  but  they  may  have  surmised  it.   If  so,  one 
would  hypothesize  that  they  would  have  attempted  to  be  more 
accurate  than  usual  in  the  original  keypunching  in  order  to 
avoid  disclosing  an  excessively  high  error  rate. 

Table  25  summarizes  the  discrepancies  found  between  the 
corresponding  sets  of  cards.   The  per  cent  of  keystrokes  in 
error  is  actually  the  per  cent  of  keystrokes  in  error  which 
were  removed  by  key  verification  (a  small  number  of  errors 
remained  after  key  verification,  some  of  which  showed  up  in 
the analysis of the recoded data).

Three  critical  errors  do  not  show  up  in  Table  25. 
Through  human  error  or,  more  likely,  through  mechanical 
malfunction,  three  of  the  unverified  cards  contained  an 
invalid combination of punches in a given column. These
errors  prohibited  the  cards  from  even  being  read  by  the 
computer  and  thus  presented  serious  processing  problems. 
All  of  these  errors  were  found  and  corrected  during  key 
verification. 


CHAPTER XI
SUMMARY  AND  CONCLUSIONS 


Results  of  Using  Various  Editing  Procedures 

The analysis of the results found in this study involving
the travel survey data collected in Evansville, Indiana,
clearly  shows  that  no  significant  differences  are  found  in 
the  base  year  traffic  assignment,  trip  distribution,  and  trip 
generation  models  between  a  procedure  using  unedited  data  and 
one  using  thoroughly  edited  and  corrected  data.   However, 
substantial  operational  difficulties  are  encountered  in 
getting  some  of  the  computer  programs  (e.g.,  the  programs 
which  inserted  expansion  factors)  to  perform  successfully 
when  using  totally  unedited  data  as  input. 

In comparison to the results which are obtained by
following either of the above editing procedures, significantly
different results are found in the three travel models
when an editing procedure is followed which calls for the
deletion of an entire sample of clustered trip data whenever
an error is found in the sample. Two possible causes for
the resulting differences were observed. First, because of
minor coding errors, an inordinate number of non-completed
interviews was rejected, thus biasing the computed expansion
factors in favor of completed interviews. Second, the typical
discarded sample was not representative of the entire population
of samples. It typically contained more than the
average number of trips and, in the case of the home interview
survey, the household had a higher than average number of
persons, number of cars, and income.


Although only a portion of the total number of incorrectly
recorded data errors are detected by computer editing,
one would not expect to get significantly different results
even if one were able to remove all such errors.

Substantially  more  trip  cards  failed  consistency  checks 
than failed single field error checks. However, all data
errors  causing  operational  problems  in  the  use  of  the  various 
computer  programs  were  detected  by  single  field  error  checks. 
The  consistency  checks  were  more  difficult  to  implement  in  a 
computer  program  and,  for  those  cards  rejected,  provided  the 
greater  difficulty  in  determining  the  necessary  corrections. 

Directional  Split  Assumptions 

When  working  with  unlinked  ADT  trips,  an  assumed  50-50 
directional  split  for  converting  a  triptable  from  P&A  format 
to  O&D  format  is  justifiable.   While  the  observed  differences 
might  be  slightly  greater  if  linked  trips  were  used,  it  would 
be  reasonable  to  expect  that  the  errors  introduced  by  an 
assumed  50-50  directional  split  would  still  be  negligible. 

Computer  Program  Contributions 

During  the  conduct  of  this  project,  it  was  necessary  to 
completely  rewrite  programs  REFORM,  COMPARE,  and  SPLIT  from 
the  Federal  Highway  Administration's  IBM  360  battery  of  urban 
transportation  planning  computer  programs.   The  corresponding 
source  decks  and  documentation  have  been  furnished  to  the 
Federal  Highway  Administration  for  incorporation  in  the 
forthcoming  version  of  that  battery.   Similarly,  they  have 
been  furnished  minor  corrections  to  several  other  computer 
programs,  including  GM  and  TDIST. 


CHAPTER XII
GUIDELINES  FOR  DATA  PROCESSING  AND  EDITING 


The  following  paragraphs  consist  of  notes  and  guidelines 
which  reflect  acknowledged  good  practice,  the  experiences  of 
the  author  in  earlier  transportation  studies,  and  the  findings 
of  this  research. 

Notes  on  Form  and  Code  Design 

The  Indiana  State  Highway  Commission,  as  well  as  other 
agencies  involved  in  the  collection  of  travel  survey  data, 
should consider redesigning some of its travel survey interview
and coding forms with the following considerations in
mind: 

1.  Those  data  fields  common  to  both  the 
dwelling  unit  summary  and  internal  trip  report 
forms  (e.g.,  sample  number,  card  number,  and 
residence  zone)  should  be  located  in  the  same 
columns  on  each. 

2.  Provision  should  be  made  on  all  but 
external  interview  forms  for  three  digit  trip 
numbers  and  number  of  trips.   Possibly  the 
size  of  other  fields  (e.g.,  number  of  persons 
regularly  enrolled  in  school)  also  needs  to  be 
increased. 

3.  Since  procedural  coding  errors  were  found 
to  be  the  major  cause  of  incorrectly  recorded 
data,  simplicity  of  form  design  is  essential 
(this  definitely  applies  to  geographic  coding 
indices as well as to travel survey forms).

Similar consideration should be given to the following
points  in  designing  the  codes  to  be  used  on  the  forms: 

1.  Never use a code in which a blank field
is a legitimate response (such as for an
unknown land use), for there is no way that
a computer program can subsequently determine
whether the intended response was indeed a
blank or whether another response was intended
but omitted (omitted codes are a non-trivial
type of coding error).

2.  If  possible,  a  code  consisting  of  all 
zeros  in  the  given  field  should  be  avoided,  as 
it  sometimes  requires  more  complicated  computer 
programming  to  be  able  to  differentiate  between 
a  field  of  blanks  and  a  field  of  zeros. 

3.  When  using  an  IBM  state-county  or  similar 
code  for  external  zone  numbers,  it  should  be 
decided  and  stated  in  advance  how  to  code 
responses  in  which  only  the  state  is  indicated. 

4.  Serious thought should be given to using
a normal 12 hour clock time with AM or PM
suffix instead of a 24 hour clock time.
Apparently due to unfamiliarity, many interviewer
and coder errors were made in
recording time to a 24 hour clock.

5.  Consideration  might  be  given  to  applying 
some  of  the  results  of  the  psychological 
research  into  the  nature  of  errors  discussed 
in  the  Review  of  Literature  in  an  attempt  to 
design  codes  which  result  in  the  least  amount 
of  incorrectly  recorded  data  errors. 

6.  The  use  of  a  single  card  column  to  record 
two  or  more  distinct  data  values  through  the 
use  of  "multiple  punches"  should  be  avoided. 
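
The distinction drawn in points 1 and 2 between a field of blanks and a field of zeros can be illustrated with a small decoding routine. The sketch is hypothetical and is not taken from the study's programs: the function name and the card layout are made up, and the card is treated as an 80-character string with 1-based column numbers.

```python
def decode_numeric_field(card, start, width):
    """Decode a fixed-width numeric field from a card image.

    Returns None for an all-blank field (nothing was coded) and an int
    otherwise, so that a blank field and a field of zeros stay distinct.
    `start` is a 1-based card column.
    """
    field = card[start - 1:start - 1 + width]
    if field.strip(" ") == "":
        return None
    if not (field.isascii() and field.isdigit()):
        last = start + width - 1
        raise ValueError(f"invalid characters in columns {start}-{last}: {field!r}")
    return int(field)


# Hypothetical card: columns 1-3 hold 007, 4-6 are blank, 7-9 hold 000.
card = "007   000".ljust(80)
values = [decode_numeric_field(card, col, 3) for col in (1, 4, 7)]
```

If a blank field were itself a legitimate code, the `None` result could not be told apart from an omitted response, which is the difficulty point 1 warns against.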

Guidelines for Data Processing

Data  digitized  through  a  keyboard  should  be  key  verified. 

When  the  output  from  the  digitizing  process  is  punch 
cards,  the  first  data  processing  step  involving  the  computer 
is  to  copy  the  punch  cards  onto  some  magnetic  recording 
medium  (normally  a  high  density  magnetic  tape)  which  will 
reduce  their  physical  volume  and  facilitate  subsequent  computer 
processing.   It  is  advisable  to  insert  a  record  sequence 
number  into  each  output  card  image  so  that  the  original 
sequence  can  be  determined  after  subsequent  processing  has 
altered  the  sequence. 

Also,  leading  zeros  should  be  inserted  in  at  least  the 
major  and  perhaps  the  minor  sort  fields  in  order  to  insure 
that  the  card  image  will  be  positioned  correctly  after  the 
initial  sort  (this  precaution  is  advisable  even  if  leading 
zeros were supposedly coded and digitized, as some will probably
be missing). Finally, efficient data processing may
dictate that the card layout used for coding and digitizing
be altered. Computer programs are available which perform
this process of copying the original data cards onto a
computer-readable magnetic recording medium while simultaneously
reformatting the data and inserting leading zeros
and record sequence numbers.
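
The copying step described above can be sketched as follows. This is an illustration only and not the REFORM program: the function name, the field positions, and the choice to append an 8-digit sequence number after column 80 are all assumptions. All-blank sort fields are left untouched so that a missing code is not silently turned into zeros.

```python
def reformat_cards(card_images, sort_fields):
    """Yield card images with zero-filled sort fields and a sequence number.

    sort_fields is a list of (start_column, width) pairs, 1-based.
    Leading blanks in each non-blank sort field are replaced by zeros,
    and an 8-digit record sequence number is appended after column 80.
    """
    for sequence, card in enumerate(card_images, start=1):
        image = card.ljust(80)[:80]
        for start, width in sort_fields:
            begin = start - 1
            field = image[begin:begin + width]
            stripped = field.lstrip(" ")
            if stripped:  # leave all-blank fields for the error checks
                filled = stripped.rjust(width, "0")
                image = image[:begin] + filled + image[begin + width:]
        yield image + f"{sequence:08d}"


# Hypothetical cards with a major sort field in columns 1-3 and a minor
# sort field in columns 4-6.
output = list(reformat_cards([" 12  7", "345 89"], [(1, 3), (4, 3)]))
```

Because the sequence number records the original card order, the output can be sorted on the zero-filled fields and later restored to its input order if a problem has to be traced back to the card deck.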

It  would  be  imprudent  to  entrust  travel  survey  data 
collected  at  great  expense  to  old  and/or  poor  quality  magnetic 
tapes (or similar storage volumes). The cost of good tapes
certified at high density and full width has dropped significantly,
and it would be advisable to obtain a sufficient number
of  such  tapes  prior  to  beginning  computer  processing.   Even 
with  such  tapes,  however,  consideration  should  be  given  to 
having  them  initially  certified  and  periodically  cleaned. 
The  certification  process  tests  the  quality  of  the  tape  but 
destroys  any  data  previously  recorded  on  it.   The  cleaning 
process  simply  removes  lint  and  dust  from  the  surface  of  the 
tape  without  destroying  the  magnetically  recorded  data. 

In transportation studies, the number of sets of data
requiring  storage  on  a  tape  or  similar  volume  increases 
rapidly  as  the  study  progresses  through  the  editing,  factor 
insertion,  and  model  development  phases.   As  a  result,  there 
will  be  temptation  to  rapidly  scratch  (i.e.,  effectively 
destroy)  "old"  sets  of  data,  but  sound  data  processing 
practice  dictates  retaining  at  least  one  and  preferably  two 
or  more  of  the  immediately  previous  generations  of  data  so 
that,  if  necessitated  by  human  and/or  mechanical  failure, 
the  current  generation  could  be  recreated.   Similarly,  wise 
planning  requires  maintaining  duplicate  copies  of  critical 
sets  of  data  and  programs  at  a  separate  physical  location  in 
case  the  primary  site  is  destroyed  or  damaged  by  some 
disaster. 

Complete  and  accurate  record  keeping  is  a  necessity, 
particularly  since  the  number  of  sets  of  data  will  be  quite 
large.   For  each  individual  set  of  data  on  a  given  storage 
volume,  these  records  should  include  such  information  as  the 
date  generated,  how  generated,  what  the  input  data  were, 
how  many  records  are  in  the  set  of  data,  and  what  other  sets 
of  data  have  been  created  using  this  set  as  input  and  on 
what  dates.   These  records  should  be  retained  even  after  the 
data  have  been  scratched  as  it  may  be  necessary  at  some  later 
date  to  trace  the  ancestry  of  a  given  set  of  data. 
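
The record keeping described above can be sketched as a small catalog in which each set of data carries the items listed in the text, together with a routine for tracing ancestry. The class, the field names, and the example entries (including volume labels and dates) are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetRecord:
    name: str
    volume: str                # tape or other storage volume label
    generated_on: date
    generated_by: str          # how the set of data was generated
    inputs: list               # names of the input sets of data
    record_count: int
    used_by: list = field(default_factory=list)  # (dataset name, date) pairs
    scratched: bool = False    # the record is kept even after scratching


def ancestry(name, catalog):
    """Return all ancestor dataset names of `name`, nearest first."""
    found = []
    queue = list(catalog[name].inputs)
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            if parent in catalog:
                queue.extend(catalog[parent].inputs)
    return found


# Hypothetical entries; labels, dates, and counts are made up.
catalog = {
    "raw": DatasetRecord("raw", "T001", date(2000, 1, 3), "card-to-tape copy", [], 10000),
    "edited": DatasetRecord("edited", "T002", date(2000, 1, 10), "error checks", ["raw"], 9950),
    "factored": DatasetRecord("factored", "T003", date(2000, 1, 17), "factor insertion", ["edited"], 9950),
}
catalog["raw"].scratched = True  # data scratched, record retained
```

Since the catalog entry outlives the data it describes, `ancestry("factored", catalog)` still reports the scratched "raw" set, which is the situation the text has in mind.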

Guidelines  for  Computer  Editing 

The  analyses  for  which  the  travel  survey  data  are  to  be 
used  should,  if  possible,  be  determined  prior  to  editing. 
The  editing  procedure  to  be  used  should  then  incorporate  the 
minimal  number  of  error  checks  consistent  with  the  uses  to 
be  made  of  the  data. 

Assuming good quality control in the interviewing,
coding, and digitizing operations, the only consistency
checks which are necessary in terms of the accuracy of the
resulting traffic assignment, trip distribution, and trip
generation models are those which verify that the number of
total trips and the number of auto driver trips (if applicable)
in a home interview, truck, or taxi sample is correct.
For other analyses such as a parking study based on O&D data
(44), other consistency checks might be justified. Each
additional consistency check made in home interview survey
data will cost, on the average, about thirty-five dollars
per 10,000 trip cards and will result in the loss of additional
calendar time before the data are available for analysis
purposes. Similar costs and time losses could be expected
with other travel survey data.
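
The one consistency check recommended above, that the number of trips reported for a sample agrees with the number of trip cards present, can be sketched as follows. The function name and the data layout are hypothetical; in particular, it is assumed that the declared number of trips comes from a summary record for each sample.

```python
from collections import Counter


def trip_count_failures(declared_trips, trip_cards):
    """Return {sample: (declared, found)} where the two trip counts disagree.

    declared_trips maps sample number -> number of trips reported on the
    summary record; trip_cards is an iterable of sample numbers, one per
    trip card.
    """
    found = Counter(trip_cards)
    samples = set(declared_trips) | set(found)
    return {
        s: (declared_trips.get(s, 0), found.get(s, 0))
        for s in sorted(samples)
        if declared_trips.get(s, 0) != found.get(s, 0)
    }


# Made-up sample numbers: 0002 is one card short and 0003 has a trip
# card but no summary record.
failures = trip_count_failures(
    {"0001": 3, "0002": 2},
    ["0001", "0001", "0001", "0002", "0003"],
)
```

The same routine applies to auto driver trips if the cards are first restricted to those coded as auto driver trips.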

Because  of  problems  which  might  otherwise  result  in  the 
operation  of  various  computer  programs,  it  is  advisable  to 
make  all  of  the  single  field  error  checks  in  those  fields 
which  contain  data  which  will  actually  be  used.   The  checks 
made  conceivably  could  be  limited  to  those  which  insure  that 
no  invalid  characters  or  embedded  blanks  exist  in  the  given 
field,  but  it  would  be  best  to  verify  that  the  contents  of 
the  field  represent  a  valid  number  or  code. 
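
The single field checks named above can be sketched as one routine that reports invalid characters, embedded blanks, and, where a list of valid codes is supplied, codes outside that list. The function name and the decision to report an all-blank field as a problem are assumptions made for illustration.

```python
def single_field_errors(field, valid_codes=None):
    """Return a list of problems found in one fixed-width numeric field."""
    problems = []
    content = field.strip(" ")
    if content == "":
        return ["blank field"]
    if " " in content:
        problems.append("embedded blank")
    if any(ch not in "0123456789" for ch in content if ch != " "):
        problems.append("invalid character")
    # The valid-code test is only meaningful for an otherwise clean field.
    if not problems and valid_codes is not None and content not in valid_codes:
        problems.append("not a valid code")
    return problems
```

Running only the first two tests corresponds to the minimal checking mentioned in the text; supplying `valid_codes` gives the preferred verification that the field holds a valid number or code.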

Clustered  trip  data  failing  the  error  checks  should  be 
either  corrected  or  replaced  by  "good"  data.   Only  rarely 
should  the  entire  sample  (or  person)  be  discarded.   Well 
thought-out  procedures  for  compensating  in  the  expansion 
factors  for  discarded  data  must  be  developed  and  applied. 

After  corrections  have  been  made  to  travel  survey  data 
containing  errors,  the  updated  data  should  be  recycled 
through  the  error  checking  process  to  insure  that  the  changes 
made  were  correct  and  did  not  introduce  other  error  failures. 

If  interview  rates  at  the  external  stations  are  fairly 
high,  the  corresponding  percentage  of  error  check  failures 
is  relatively  small,  and  the  distribution  of  the  error  check 
failures  is  fairly  random  with  respect  to  station,  direction, 
hour,  and  vehicle  type,  then  consideration  should  be  given 
to  simply  discarding  the  records  in  error. 


CHAPTER XIII
RECOMMENDATIONS  FOR  ADDITIONAL  RESEARCH 


The  results  obtained  in  this  research  project  and  some 
of  the  reviewed  literature  both  indicate  that  the  "automatic" 
correction  by  the  computer  of  travel  survey  data  failing 
error  checks  would  probably  save  time  and  money  in  the 
editing  phase  while  not  having  a  detrimental  effect  on  the 
accuracy  of  the  resulting  models.   Such  a  methodology  should 
be  developed  and  tested. 


APPENDIX A
SURVEY  FORMS 


The  following  figures  are  photo-reductions  of  the  actual 
forms  used  for  coding  the  travel  survey  data  from  the 
Evansville  and  the  Lafayette  transportation  studies  (with  the 
exception  of  the  Lafayette  external  survey,  the  same  forms 
were also used for recording the original interview data).
Unless  otherwise  specified  in  the  caption,  the  same  form  was 
used  in  both  Evansville  and  Lafayette. 


76 


a 

• 

a 

a 

a 

' 

- 

3 

B 

a 

3 
E 

3 

i 

E 

D 

7 

s 

E 
: 
z 

s 

! 

Z 

1 
E 

D 
Z 

a 

■ 

«■ 

o 

N      (3 


>- 

< 

3 
to 

Z 

3 
O 

z 


5 

O 


>- 

a 

3 
»— 
i/> 

Z 

o 


O 

a. 
•si 

Z 

< 


*  8 

I  < 

2  S 

I  > 

I  1 


->     *     J     S     Z     O     a. 


*l 


is: 


'I 


K 


\U 


m 


2 

o 
u. 

Q 
CC 

< 

u 

>- 

OC 

< 


3 
CO 


2 
3 

O 

z 


o 

UJ 


n 


5   i 


«  s 

E 


> 
to 


ct 

3 
O 


77 


I 


Mb 

2 


LU 

at. 

a. 


m     !M  *IJ. 

s    s"           lllll 

—    I       -^-. 

*****  ****, 

nihil 

it 

Upiii; 

*;JM,rt*^*T.ii*             d-M-l 

siiihii* 

144*144            4^4.4444^44 

-    i. 

n 

s 

H 

G 

D         D 

S 

i'  i 

i 

* 

i               8     6 

1 
"1 

!         x 

I     ~3 

1    1 

t 

X 

"      1 

t                      a 
a                      a 

1          '    1       1          ' 

;  !       1 

-i  1  .iilil 

ii  niii 

*****       g  ***** 

111)             j 
Hill         lllll 

*****           ****** 

111 

*****      - 

llill  il 

<*  -1  •)  wi  4  *  4  - 

IL  niii 

**         -  d-*H, 

Ii  f             III  t 

iiiLiilillllii 

rf  d  pi  *  4        ***********        - 

£           1    1 

s       _      < 

3           o 

t 

i 

% 

1 

a 

I                 ! 

* 

1         1 

It 

11 

* 

3                                  S 
I               *  i            I               S 

a                       a 



EXTERNAL INTERVIEW FORM

FIGURE A3.  EXTERNAL SURVEY TRIP CARD FORM


[Form panels: ADMINISTRATIVE RECORD (Confidential); TRUCK AND TAXI TRIP SUMMARY]

FIGURE A4.  TRUCK/TAXI TRIP CARD FORM


FIGURE A5.  TRUCK/TAXI TRIP CARD CONTINUATION FORM




APPENDIX  B 
GEOGRAPHIC  CODING  INDICES 


The following figures are photo-reductions of sample pages
from the two sets of geographic coding indices actually used,
one from Evansville and one from Lafayette.  The street
address coding index is used to determine the zone number
associated with a given numeric street address (e.g., 2208
Arapahoe Drive), and the place-name coding index is used to
determine the zone number associated with a location
commonly known by its name (e.g., Jefferson High School).


FIGURE B4.  LAFAYETTE PLACE-NAME CODING INDEX
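
As a rough illustration of how coding indices of this kind could be applied by machine, the sketch below looks up a zone number from a street address or from a place name. The index layout (street name, low and high house number, zone), every range and zone number, and the names `STREET_INDEX`, `PLACE_INDEX`, `zone_for_address`, and `zone_for_place` are assumptions made for illustration; they are not taken from the Evansville or Lafayette indices.

```python
from typing import Optional

# Hypothetical street index entries: (street name, low number, high number, zone).
# The ranges and zone numbers are invented for illustration only.
STREET_INDEX = [
    ("ARAPAHOE DR", 2000, 2399, 117),
    ("ARAPAHOE DR", 2400, 2999, 118),
]

# Hypothetical place-name entries: place name -> zone number.
PLACE_INDEX = {"JEFFERSON HIGH SCHOOL": 42}


def zone_for_address(street: str, number: int) -> Optional[int]:
    """Return the zone whose address range contains the house number."""
    key = street.strip().upper()
    for name, low, high, zone in STREET_INDEX:
        if name == key and low <= number <= high:
            return zone
    return None  # not in the index; the address would need manual coding


def zone_for_place(place: str) -> Optional[int]:
    """Return the zone for a location known by its name, if it is indexed."""
    return PLACE_INDEX.get(place.strip().upper())


print(zone_for_address("Arapahoe Dr", 2208))
print(zone_for_place("Jefferson High School"))
```

A lookup that returns `None` corresponds to an address or place that the index does not cover and that a coder would have to resolve by hand.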


APPENDIX  C
ERROR  CHECKING  SPECIFICATIONS  AND  RESULTS 


For each of the travel surveys, the error checks made in the
conduct of this research project are specified in the tables
which follow.  A specific error number was associated with
each error check, and accompanying tables show the number of
cards failing each check in both Evansville and Lafayette,
together with the associated error rates (the per cent of
all cards subjected to a given check that failed it).
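
The error rate defined above is a simple ratio, and the sketch below shows the arithmetic. The function name `error_rate` is hypothetical, and the example assumes that every card in the survey was subjected to the check, so that the survey's total card count serves as the denominator. The figures used are the Evansville values for error 3 in Table C5 (809 cards in error out of 40,721 total cards).

```python
def error_rate(cards_in_error: int, cards_checked: int) -> float:
    """Per cent of the checked cards that failed a given error check."""
    if cards_checked <= 0:
        raise ValueError("cards_checked must be positive")
    return round(100.0 * cards_in_error / cards_checked, 2)


# Evansville, external survey, error 3: 809 of 40,721 cards.
print(error_rate(809, 40721))  # 1.99
```

Where a check applies to only some of the cards, the denominator would be the number of cards actually examined rather than the survey total.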


Table C1.  Home Interview Survey Error Checks

Error
Number   Cause

A.  Single Field Checks

 1.  Invalid card number.
 2.  Invalid sample number.
 3.  Invalid origin zone number.
 4.  Invalid destination zone number.
 5.  Invalid origin land use (blanks not permitted).
 6.  Invalid destination land use (blanks not permitted).
 7.  Invalid from purpose.
 8.  Invalid to purpose.
 9.  Invalid zone of residence.
10.  Invalid person number.
11.  Invalid trip number.
12.  Invalid travel date.
13.  Invalid start time.
14.  Invalid arrive time.
15.  Invalid sex and race code.
16.  Invalid age.
17.  Invalid mode.
18.  Invalid driver's license code.
19.  Invalid number in car.
20.  Invalid kind of parking.
21.  Invalid first work flag.
22.  Invalid total number of persons living at the D.U.
23.  Invalid structure type code.
24.  Invalid own/rent flag.
25.  Invalid number of cars garaged at the D.U.
26.  Invalid number of persons regularly employed.
27.  Invalid number of persons regularly enrolled in school.
28.  Invalid interview status code.
29.  Invalid total number of trips.
30.  Invalid number of auto driver trips.
31.  Invalid number of persons over four years of age.
32.  Invalid number of persons making trips.
33.  Invalid number of persons not making trips.
34.  Invalid number of persons with unknown trips.
35.  Invalid length of residence.
36.  Invalid household income code.
37.  Invalid code for race of occupants.
38.  Invalid parking address.
39.  Invalid parking duration, rate, or fee.

B.  Consistency Checks

40.  Home coded for both from purpose and for to purpose.


Table C1 continued.

Error
Number   Cause

41.  Arrival time erroneously precedes start time.
42.  Inconsistency between from purpose, residence zone,
     and origin zone.
43.  Inconsistency between to purpose, residence zone, and
     destination zone.
44.  Both trip ends external to the study area.
45.  First work/school trip flag inconsistent with to
     purpose.
46.  Inconsistency between mode and kind of parking
     information.
47.  Inconsistency between mode and number in car
     information.
48.  Total number of trips less than number of auto driver
     trips.
49.  Number of persons over four years do not correspond.
50.  Total number of persons less than number over four
     years of age.
51.  Number of persons employed and in school exceeds total
     number.
52.  Total number of trips less than number of persons
     making trips.
53.  Inconsistency between status of interview and trip
     information.
54.  Inconsistency between mode and parking address,
     duration, rate, or fee information.
55.  Trips for a given person not in sequence with respect
     to time.
56.  Differences in tripmaker information for a given
     person.
57.  Trips made by a given person not numbered
     consecutively.
60.  Differences in travel date and/or residence zone
     within a given sample.
61.  Total number of trips and/or number of auto driver
     trips incorrect.
62.  Number of persons actually making trips less than
     recorded number.
63.  Incorrect number of dwelling unit summary cards for a
     given sample.
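
A few of the consistency checks listed above (numbers 40, 41, 48, 50, 51, and 52) can be sketched as follows. This is only an illustration: the record layout (plain dictionaries), every field name, the purpose code `HOME = 0`, and the use of 24-hour HHMM integers for times are assumptions and do not come from the survey card formats. Check 51 is read here as "employed plus enrolled exceeds total persons", which is one interpretation of the wording.

```python
HOME = 0  # hypothetical purpose code for "home"


def check_trip_card(card: dict) -> list:
    """Return the Table C1 error numbers that a trip card fails."""
    errors = []
    # Check 40: home coded for both the from purpose and the to purpose.
    if card["from_purpose"] == HOME and card["to_purpose"] == HOME:
        errors.append(40)
    # Check 41: arrival time precedes start time (24-hour HHMM values;
    # trips spanning midnight are not handled in this sketch).
    if card["arrive_time"] < card["start_time"]:
        errors.append(41)
    return errors


def check_dwelling_card(card: dict) -> list:
    """Return the error numbers that a dwelling unit summary card fails."""
    errors = []
    # Check 48: total trips less than auto driver trips.
    if card["total_trips"] < card["auto_driver_trips"]:
        errors.append(48)
    # Check 50: total persons less than persons over four years of age.
    if card["total_persons"] < card["persons_over_four"]:
        errors.append(50)
    # Check 51: persons employed plus persons in school exceed total persons.
    if card["employed"] + card["in_school"] > card["total_persons"]:
        errors.append(51)
    # Check 52: total trips less than number of persons making trips.
    if card["total_trips"] < card["persons_making_trips"]:
        errors.append(52)
    return errors


trip = {"from_purpose": 0, "to_purpose": 0,
        "start_time": 815, "arrive_time": 800}
print(check_trip_card(trip))
```

Returning a list of error numbers, rather than stopping at the first failure, lets each failing card be counted once under every check it fails.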


Table C3.  Partial List of Other Feasible (But Not Made)
Error Checks of Home Interview Survey Data

1.  If either the from or to purpose is serve passenger,
mode must be either auto driver or motorcycle.

2.  Each trip having either a from or to purpose of serve
passenger (or one of two related, consecutive trips having
a common trip end purpose of serve passenger) must have more
than one coded for number in car.

3.  The first work/school trip flag must be set on, and
only on, the lowest-numbered trip (for a given person) with
an appropriate destination purpose.

4.  If mode is walk, then to purpose must be work (or
school, if appropriate).

5.  Kind of parking code should be compatible with the
type of parking available at the destination/parking zone.

6.*  Kind of parking, parking rate, and parking fee
information should be consistent.

7.*  Parking fee should be reasonable.

8.*  If driver's license code is yes, tripmaker's age
should equal or exceed the minimum legal driving age.

9.*  If driver's license code is yes, tripmaker's age
should not exceed some arbitrary maximum value.

10.*  If mode is auto driver or motorcycle, driver's license
code should be yes.

11.*  Land use at the home should be residential.

12.*  On consecutive trips by a given person, zone number,
purpose, and land use should match.

13.*  The first trip of the day for a given person should
originate either at home or external to the study area.

14.*  The last trip of the day for a given person should
terminate either at home or external to the study area.

15.*  The highest person number in a given sample should not
exceed the total number of persons living at the dwelling
unit.

16.*  Length of trip (time) should not exceed some arbitrary
maximum value.

17.*  The race code from the dwelling unit summary data
should be consistent with the consensus of race codes from
the internal trip report data.

18.*  On a sequence of trips away from home for a given
person, mode should not change from auto driver to some
other mode or vice versa.

19.*  If the trips made by two or more persons within a
given sample parallel each other, then they should agree in
particulars such as mode, start time, arrive time, land use,
etc. (this particular error check would be exceedingly
difficult to implement on a computer).

20.*  Number of persons under five years of age living at a
given dwelling unit should not exceed some arbitrary
maximum.

*  Exceptions are possible.
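
Checks 12, 13, and 14 of the list above operate on all of one person's trips taken together. The sketch below shows one way they might be applied; it is not part of the original study. The field names, the purpose code `HOME = 0`, and the rule in `is_external` that zones numbered 900 and above lie outside the study area are all hypothetical. Because exceptions to these checks are possible, a failure here is better treated as a flag for review than as a definite error.

```python
HOME = 0  # hypothetical purpose code for "home"


def is_external(zone: int) -> bool:
    """Hypothetical rule: zones numbered 900 and above are external."""
    return zone >= 900


def check_person_trips(trips: list) -> list:
    """Apply feasible checks 12-14 to one person's trips, in trip order.

    Returns (check number, trip number) pairs for the trips that fail.
    """
    failures = []
    if not trips:
        return failures
    first, last = trips[0], trips[-1]
    # Check 13: the first trip should start at home or outside the area.
    if first["from_purpose"] != HOME and not is_external(first["origin_zone"]):
        failures.append((13, first["trip_number"]))
    # Check 14: the last trip should end at home or outside the area.
    if last["to_purpose"] != HOME and not is_external(last["dest_zone"]):
        failures.append((14, last["trip_number"]))
    # Check 12: each trip should begin where the previous trip ended.
    for prev, cur in zip(trips, trips[1:]):
        linked = (
            prev["dest_zone"] == cur["origin_zone"]
            and prev["to_purpose"] == cur["from_purpose"]
            and prev["dest_land_use"] == cur["origin_land_use"]
        )
        if not linked:
            failures.append((12, cur["trip_number"]))
    return failures
```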


Table C4.  External Survey Error Checks

Error
Number   Cause

A.  Single Field Checks

 1.  Invalid card number.
 2.  Invalid station number.
 3.  Invalid origin zone number.
 4.  Invalid destination zone number.
 5.  Invalid origin land use (blanks permitted).
 6.  Invalid destination land use (blanks permitted).
 7.  Invalid from purpose.
 8.  Invalid to purpose.
 9.  Invalid number of persons.
10.  Invalid hour of interview.
11.  Invalid direction.
12.  Invalid vehicle type code.
13.  Invalid number of stops.
14.  Invalid entry station.
15.  Invalid exit station.
16.  Invalid vehicle base code.

B.  Consistency Checks

30.  Home coded for both from purpose and for to purpose.
31.  Internal origin zone on an inbound trip.
32.  Entry station not blank on an inbound trip.
33.  Internal destination zone on an outbound trip.
34.  Exit station not blank on an outbound trip.
35.  Entry station same as interview station.
36.  Exit station same as interview station.
37.  Exit station blank on an inbound trip with an
     external destination.
38.  Exit station not blank on an inbound trip with an
     internal destination.
39.  Entry station blank on an outbound trip with an
     external origin.
40.  Entry station not blank on an outbound trip with an
     internal origin.
41.  Inconsistency between from purpose, vehicle base,
     direction, and origin zone.
42.  Inconsistency between to purpose, vehicle base,
     direction, and destination zone.
43.  Invalid hour of interview for given station.
44.  Inconsistency between vehicle type and commodity
     codes.
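
External survey checks 31 through 40 all relate direction, trip-end zones, and the entry and exit station fields, so they lend themselves to a single routine. The following is a minimal sketch under stated assumptions: the field names, the direction codes `"in"` and `"out"`, the use of `None` for a blank station field, and the rule in `is_external` (zones numbered 900 and above are external) are hypothetical and are not taken from the card layout.

```python
def is_external(zone: int) -> bool:
    """Hypothetical rule: zones numbered 900 and above are external."""
    return zone >= 900


def check_external_card(card: dict) -> list:
    """Return the Table C4 consistency errors (31-40) that a card fails."""
    errors = []
    entry, exit_ = card["entry_station"], card["exit_station"]  # None = blank
    origin_ext = is_external(card["origin_zone"])
    dest_ext = is_external(card["dest_zone"])

    if card["direction"] == "in":
        if not origin_ext:
            errors.append(31)  # internal origin on an inbound trip
        if entry is not None:
            errors.append(32)  # entry station not blank on an inbound trip
        if dest_ext and exit_ is None:
            errors.append(37)  # through trip with no exit station
        if not dest_ext and exit_ is not None:
            errors.append(38)  # exit station given for internal destination
    elif card["direction"] == "out":
        if not dest_ext:
            errors.append(33)  # internal destination on an outbound trip
        if exit_ is not None:
            errors.append(34)  # exit station not blank on an outbound trip
        if origin_ext and entry is None:
            errors.append(39)  # through trip with no entry station
        if not origin_ext and entry is not None:
            errors.append(40)  # entry station given for internal origin

    if entry is not None and entry == card["station"]:
        errors.append(35)  # entry station same as interview station
    if exit_ is not None and exit_ == card["station"]:
        errors.append(36)  # exit station same as interview station
    return sorted(errors)
```

For example, an inbound card with external origin and destination zones and both station fields blank would fail only check 37 under these assumptions.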


Table C5.  External Survey Error Rates

              Evansville                Lafayette
 Error       Cards   Per Cent         Cards   Per Cent
Number    in Error   in Error      in Error   in Error

     1           0       0.00             0       0.00
     2          42       0.10           232       0.47
     3         809       1.99           507       1.03
     4         795       1.95           312       0.63
     5          52       0.13            --         --
     6          61       0.15            --         --
     7         166       0.41            57       0.12
     8         189       0.46            56       0.11
     9         199       0.49            99       0.20
    10          30       0.07            49       0.10
    11          65       0.16           433       0.88
    12         120       0.29           102       0.21
    13         665       1.63           152       0.31
    14          40       0.10           253       0.51
    15          36       0.09            32       0.06
    16         225       0.55            80       0.16
    30         131       0.32           111       0.22
    31          12       0.03           143       0.29
    32          34       0.08           248       0.50
    33           7       0.02            69       0.14
    34          29       0.07            16       0.03
    35           2       0.00            20       0.04
    36           2       0.00            14       0.03
    37         124       0.30           200       0.40
    38          24       0.06            50       0.10
    39         113       0.28           161       0.33
    40          11       0.03           125       0.25
    41       1,241       3.05           305       0.62
    42       1,002       2.46           331       0.67
    43          17       0.04             8       0.02
    44          --         --           445       0.90

Total Cards 40,721                   49,439


Table C6.  Truck/Taxi Survey Error Checks

Error
Number   Cause

A.  Single Field Checks

 1.  Invalid card number.
 2.  Invalid sample number.
 3.  Invalid origin zone number.
 4.  Invalid destination zone number.
 5.  Invalid origin land use (blanks not permitted).
 6.  Invalid destination land use (blanks not permitted).
 7.  Invalid from purpose.
 8.  Invalid to purpose.
 9.  Invalid garaged zone.
11.  Invalid trip number.
12.  Invalid travel date.
13.  Invalid start time.
14.  Invalid arrive time.
15.  Invalid business/industrial classification code.
16.  Invalid licensed capacity code.
17.  Invalid vehicle type code.
18.  Invalid interview status code.
19.  Invalid total trips reported.
20.  Invalid total stops.
21.  Invalid total miles traveled.
22.  Invalid commodity code.
23.  Invalid license number.
24.  Invalid number in vehicle.
25.  Invalid block number.

B.  Consistency Checks

30.  Inconsistency between total trips and interview
     status fields.
31.  Inconsistency between trip information and interview
     status field.
32.  Total stops less than total trips.
33.  Home coded for both from purpose and for to purpose.
34.  Arrival time erroneously precedes start time.
35.  Inconsistency between from purpose, garaged zone,
     and origin zone.
36.  Inconsistency between to purpose, garaged zone, and
     destination zone.
37.  Both trip ends external to the study area.
50.  Total number of trips incorrect.
51.  Trips not numbered consecutively.
52.  Trips not in sequence with respect to time.
53.  Differences in sample information (garaged zone,
     travel date, etc.)
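
Truck/taxi checks 50 through 53 compare the trip cards of one vehicle sample with each other rather than examining one card at a time. The sketch below illustrates the idea; the field names, the 24-hour HHMM time values, and the assumption that trip numbers start at 1 are hypothetical and are not specified in the table.

```python
def check_sample(trips: list, total_trips_reported: int) -> list:
    """Return the Table C6 errors (50-53) for one sample's trip cards.

    `trips` holds the trip cards of a single vehicle, in file order.
    """
    errors = []
    # Check 50: reported total does not match the number of trip cards.
    if len(trips) != total_trips_reported:
        errors.append(50)
    # Check 51: trip numbers should run 1, 2, 3, ... (assumed start of 1).
    numbers = [t["trip_number"] for t in trips]
    if numbers != list(range(1, len(trips) + 1)):
        errors.append(51)
    # Check 52: no trip should start before the previous trip arrived.
    for prev, cur in zip(trips, trips[1:]):
        if cur["start_time"] < prev["arrive_time"]:
            errors.append(52)
            break
    # Check 53: garaged zone and travel date should agree on every card.
    if len({(t["garaged_zone"], t["travel_date"]) for t in trips}) > 1:
        errors.append(53)
    return errors
```

The same pattern would apply to home interview checks 55, 57, and 60, with the grouping done by person or by sample instead of by vehicle.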